====== Iterated Prisoner's Dilemma ====== [[https://en.wikipedia.org/wiki/Prisoner%27s_dilemma|Prisoner's dilemma]] (PD) and its iterated variant (IPD) is a well-known problem in game theory. The goal of this project is to create and compare optimization algorithms that find a good player for a particular IPD problem instance. An IPD problem instance is given by * the payoff matrix and * the number of iterations. ===== Payoff Matrix (PM) ===== The payoff matrix must be set up such that the game is fair for both players. For 4 positive numbers a, b, c, and d, the matrix has the following form: | || P2 plays || | || 0 | 1 | | P1 plays | 0 | (A,A) | (B,C) | | | 1 | (C,B) | (D,D) | where in each pair (r1, r2) denotes the rewards for player 1 and player 2, respectively. Depending on relations among A, B, C, and D, the payoff matrix may be set up in such a way that * there is a proper dilemma, * no dilemma is present, * there is no dominant strategy, * the best strategy require the decisions of the players to alternate, * etc. You should choose several interesting PM setups and try to find players for them. ===== Representation ===== There are many possible representations of the players/strategies: * finite state machine (that can be encoded in a binary string), * set of rules, * decision tree, * neural network, * ... The chosen representation shall be consulted with the teacher. ===== Evaluation ===== IPD is a two-player game. To evaluate the quality of a player is thus possible only by playing games against other players, i.e., the quality of a player depends on the set of chosen opponents. The choice of this reference set is up to you, it may be * a static set of predefined players, * the other players in population, * a combination of the above options, ... The goal is to *maximize the sum of rewards across all played IPD games*. ===== Interesting questions ===== In no particular order: * What is the strategy used by the best player for individual PM setups? * How does the initialization affect the resulting best player? What if we seed the population with a specified number of copies of known strategies? * Is it possible to evolve a player, that plays well for any setup of PM? * Is it possible to evolve a player, that plays well against any opponent? Take these questions as an inspiration of what you can study in your project.