Iterated Prisoner's Dilemma

Iterated Prisoner's Dilemma

Prisoner's dilemma (PD) and its iterated variant (IPD) is a well-known problem in game theory. The goal of this project is to create and compare optimization algorithms that find a good player for a particular IPD problem instance.

An IPD problem instance is given by

the payoff matrix and
the number of iterations.

Payoff Matrix (PM)

The payoff matrix must be set up such that the game is fair for both players. For 4 positive numbers a, b, c, and d, the matrix has the following form:

		P2 plays
		0	1
P1 plays	0	(A,A)	(B,C)
	1	(C,B)	(D,D)

where in each pair (r1, r2) denotes the rewards for player 1 and player 2, respectively.

Depending on relations among A, B, C, and D, the payoff matrix may be set up in such a way that

there is a proper dilemma,
no dilemma is present,
there is no dominant strategy,
the best strategy require the decisions of the players to alternate,
etc.

You should choose several interesting PM setups and try to find players for them.

Representation

There are many possible representations of the players/strategies:

finite state machine (that can be encoded in a binary string),
set of rules,
decision tree,
neural network,
…

The chosen representation shall be consulted with the teacher.

Evaluation

IPD is a two-player game. To evaluate the quality of a player is thus possible only by playing games against other players, i.e., the quality of a player depends on the set of chosen opponents. The choice of this reference set is up to you, it may be

a static set of predefined players,
the other players in population,
a combination of the above options, …

The goal is to *maximize the sum of rewards across all played IPD games*.

Interesting questions

In no particular order:

What is the strategy used by the best player for individual PM setups?
How does the initialization affect the resulting best player? What if we seed the population with a specified number of copies of known strategies?
Is it possible to evolve a player, that plays well for any setup of PM?
Is it possible to evolve a player, that plays well against any opponent?

Take these questions as an inspiration of what you can study in your project.

Table of Contents

Iterated Prisoner's Dilemma

Payoff Matrix (PM)

Representation

Evaluation

Interesting questions