The main task of the homework is to implement an advanced player policy for pursuit-evasion game.
Deadline | 14. January 2018, 23:59 PDT |
---|---|
Label in BRUTE | HW04 |
Files to submit | archive with .py files, mandatory file: player.py Do not upload eval.py and game.py filesDo not upload optional pacman.policy file to BRUTE, send it via https://filesender.cesnet.cz instead. |
Mandatory tasks | 10 Points |
Resources | lab11 resource files |
Mandatory tasks - Modify the player.py script to implement a player policy for pursuit-evasion game: | ||
---|---|---|
10 points | Within the provided game simulation framework implement one of advanced pursuit evasion strategies. Choose between: - Value-iteration policy - Monte-Carlo tree search policy | |
Additional remarks | ||
Consider robot movement only in 4-neighborhood | ||
Consider the game with one evader and two pursuers | ||
For Value-iteration policy you can precompute the optimal policy and save it to the file mazes/pacman.policy . You can load the file, within the player.py class if it fits the particular scenario. The file has to be smaller than 200~MB and shared using the https://filesender.cesnet.cz with lab teachers. Note, the implementation of the strategy has to work without the precomputed policy file, i.e., if the file with correct policy is not present, calculate the policy. |
||
For Monte-Carlo tree search you are given a time for making the decision for a single robot. This time is defined by self.timeout constant within the player.py class. A possible usage in the code might look like:def monte_carlo_policy(self, grid_map, evaders, pursuers): self.next_robots = self.robots[:] #for each robot plan actions for idx in range(0, len(self.robots)): clk = time.time() while (time.time() - clk) < self.timeout: #monte-carlo tree search #make decision on where to go based on tree search self.next_robots[idx] = pos_selected |
||
Single strategy implementation is expected. Therefore, additional points ( 5 points ) will be awarded if both strategies are implemented. |
The assignment will be evaluated within the provided game simulation framework.
pacman.game
scenario - submitted solution will be tested on the full pacman scenario - the testing scenario is parametrized in mazes/pacman.game
and has the following settings:
EVADER VALUE_ITERATION ro 1 1 PURSUER VALUE_ITERATION bo 3 3 17 19The precomputed policy can be saved in
mazes/pacman.policy
and submitted as described above
EVADER MONTE_CARLO ro 1 1 PURSUER MONTE_CARLO bo 3 3 17 19The time for a single robot decision
self.timeout
is set to 5s
grid4x4
grid map, to evaluate that the strategy is actually calculated for the particular scenario