Search
We consider the following problem: your agent is placed on an island with a map. The goal of the agent is to reach the destination square on the island while collecting as many gold treasures as possible while stepping at each square at most once. There are bandits on the island that want to ambush the agent and steal the collected gold. The bandits may be placed at specific squares (termed “dangerous places”, marked as “E” in the map); there can be at most one bandit at one dangerous place. Your agent only knows the overall number of bandits, however, he does not know the exact places where they are waiting. The bandits want to harm your agent as much as possible by deciding at which dangerous places they are going to wait. When your agent visits a place with a bandit, an attack occurs. The attack is successful with some probability p (the agent will lose all the gold and the game ends). Otherwise, the attack was unsuccessful and your agent can continue. Your agent can react to the information gained during the course of the game – e.g., in the case of an unsuccessful attack, the agent can exploit this information (there is one less bandit in the remaining dangerous places).
p
The bandits primarily decide their position at the beginning of the game. Moreover, the bandits can use sensors. If your agent enters an unoccupied dangerous place, an alarm is triggered and 1 bandit can relocate to another empty dangerous place. However, this can happen at most once. The following holds:
The semestral project is split in two parts:
For the first part, we prepared a template that you should implement:
The template contains code to export the EFG into a format that can be subsequently parsed and loaded by Gambit (see below). You can check in the visual viewer if the game has been modelled as you aimed.
Automatic evaluation of the constructed trees is non-trivial, and you may receive points for invalid solutions. The score in the upload system therefore serves as only an upper bound of the final score. We will revise your submissions manually, and in case we find some errors, points will be deducted.
For the second part, we prepared a template that you should implement:
You can use following LP solvers: cvxopt gurobipy
To setup gurobi locally, there is a great tutorial in the Combinatorial Optimization course: https://cw.fel.cvut.cz/b182/courses/ko/labs
If you want to use other LP solver that is available as Python package, consult with me first. (It has to be added to automatic evaluation system.)
You should consult the manual of your LP solver to figure out how to input your sequence-form LP and solve for value of the game and player strategies.
Submit your solutions to upload system.
For the first task:
The assignment is to implement the program for solving described games. The program reads input in following format:
number: number of rows of the maze (M) number: number of columns of the maze (N) M rows, N symbols representing squares of the maze number: the overall number of the bandits float number: the probability of a successful attack
Squares of the maze are defined as follows:
# - obstacle - - empty square S - start of your agent D - destination of your agent G - gold treasure E - dangerous place
Constants for Utility Calculation: Your agent receives 10 points for reaching the destination and he receives 1 additional point for each collected gold treasure. If a bandit successfully attacks the agent, your agent receives 0 points no matter how many gold treasures he has been carrying before the attack. If your agent follows a path that does not end in the destination square (e.g., if it leads to a dead end), your agent receives 0 points.
Input Example:
7 9 ######### #E----EG# #-##-##E# #S##-##-# #-##-##-# #E-----D# ######### 2 0.5
For the second task:
The input starts the same as in the first task, and there is one additional row that specifies index of the player (0 or 1).
The output must be a valid gambit file, as defined by the `export_gambit` function in the python template.
The output of your program is the expected utility in the root node for the given player.
game_tree.py
game_lp.py
The value of the game must be exactly the same (modulo double arithmetic imprecision). If the difference is larger (e.g., >0.001), your solution is still incorrect!
7 9 ######### #G-----E# #-#####-# #S--E--D# #-#####-# #G-----E# ######### 2 0.5
7.096774193548387
7 9 ######### #G-E---E# #-#####-# #S-E--ED# #-#####-# #G-E---E# ######### 2 0.6
3.4064516129032265
5.5
5 7 ####### #E---E# #S#-#D# #E--EG# ####### 1 0.7
5.054761904761906
Tip: Actions are ordered in the sense that that for two histories h,g in the same infoset it holds that h.actions() == g.actions(). You can use action's index in the list for its identification. Do not use str(action) or some other way of encoding the action based on your previous implementation.
To identify sequences it is not enough to use the list of action indices! Consider simple poker: first player can fold or bet. Folding in one Infoset_1 (player has card J) is however not the same thing as folding in Infoset_2 (player has card Q)!
In class, we identified folding as f1 and f2, based on the fact from which infoset they came from.
Also, do not rely on Action being able to be used as Dict key as it does not have `hash()` function implemented.
Installing Gambit is not necessary for the completion of your assignments, but you might find it useful for debugging.
The gambit project is located at http://www.gambit-project.org/
Installation of the version published at the website did not work for me, but building manually from sources did (Ubuntu 19.04):
git clone git://github.com/gambitproject/gambit.git cd gambit
Update 2020-11-11: If you run into some linking problems you may want to use the forked repo. (thanks Dmitry!)
After this, you will need to set up the build scripts by executing ::
aclocal libtoolize automake --add-missing autoconf
You will need to install gui support libraries: libwxgtk >= 2.8.0
sudo apt-get install libwxgtk3.0-gtk3-dev sudo apt-get install libwxgtk3.0-gtk3-0v5
Then
./configure make -j 8 # to compile faster using 8 cores sudo make install
You can launch gambit!
gambit
You can generate a tree, and visualize it in gambit:
python game_tree.py < maze1.txt > maze1.efg gambit maze1.efg