HW 04 - Pursuit-evasion in Robotics

The main task of the homework is to implement an advanced player policy for pursuit-evasion game.

Deadline	14. January 2018, 23:59 PDT
Label in BRUTE	HW04
Files to submit	archive with `.py` files, mandatory file: `player.py` Do not upload `eval.py` and `game.py` files Do not upload optional `pacman.policy` file to BRUTE, send it via https://filesender.cesnet.cz instead.
Mandatory tasks	10 Points
Resources	lab11 resource files

Mandatory tasks - Modify the `player.py` script to implement a player policy for pursuit-evasion game:
10 points	Within the provided game simulation framework implement one of advanced pursuit evasion strategies. Choose between: - Value-iteration policy - Monte-Carlo tree search policy
Additional remarks
Consider robot movement only in 4-neighborhood
Consider the game with one evader and two pursuers
For Value-iteration policy you can precompute the optimal policy and save it to the file `mazes/pacman.policy`. You can load the file, within the `player.py` class if it fits the particular scenario. The file has to be smaller than 200~MB and shared using the https://filesender.cesnet.cz with lab teachers. Note, the implementation of the strategy has to work without the precomputed policy file, i.e., if the file with correct policy is not present, calculate the policy.
For Monte-Carlo tree search you are given a time for making the decision for a single robot. This time is defined by `self.timeout` constant within the `player.py` class. A possible usage in the code might look like: def monte_carlo_policy(self, grid_map, evaders, pursuers): self.next_robots = self.robots[:] #for each robot plan actions for idx in range(0, len(self.robots)): clk = time.time() while (time.time() - clk) < self.timeout: #monte-carlo tree search #make decision on where to go based on tree search self.next_robots[idx] = pos_selected
Single strategy implementation is expected. Therefore, additional points ( 5 points ) will be awarded if both strategies are implemented.

Evaluation - how will be the homeworks evaluated

The assignment will be evaluated within the provided game simulation framework.

Testing in pacman.game scenario - submitted solution will be tested on the full pacman scenario - the testing scenario is parametrized in mazes/pacman.game and has the following settings:
1. For Value-iteration policy:
```
EVADER VALUE_ITERATION ro 1 1
PURSUER VALUE_ITERATION bo 3 3 17 19
```
  The precomputed policy can be saved in mazes/pacman.policy and submitted as described above
2. For Monte-Carlo tree search policy:
```
EVADER MONTE_CARLO ro 1 1
PURSUER MONTE_CARLO bo 3 3 17 19
```
  The time for a single robot decision self.timeout is set to 5s
Testing on a small maze - submitted solutions will be tested within a small maze, e.g., grid4x4 grid map, to evaluate that the strategy is actually calculated for the particular scenario