====== HW 04 - Pursuit-evasion in Robotics ======
The main task of the homework is to implement an advanced player policy for pursuit-evasion game.

^ **Deadline** |  <wrap important>14. January 2018, 23:59 PDT</wrap> ^
^ **Label in BRUTE** |  HW04^
^ **Files to submit** |  archive with ''.py'' files, mandatory file: ''player.py''\\ <fc #ff0000>Do not upload ''eval.py'' and ''game.py'' files</fc>\\ <fc #ff0000>Do not upload optional ''pacman.policy'' file to BRUTE, send it via [[https://filesender.cesnet.cz]] instead.</fc>^
^ **Mandatory tasks** |  **10** Points ^
^ **Resources** |  {{:courses:b4m36uir:labs:pursuit_evaision.zip| lab11 resource files}} ^

^ <fc #ff0000>Mandatory tasks</fc> - Modify the ''player.py'' script to implement a player policy for pursuit-evasion game: |^ 
^ **10** points | Within the provided game simulation framework implement one of advanced pursuit evasion strategies. Choose between:\\  - **Value-iteration policy**\\  - **Monte-Carlo tree search policy** |
^ <fc #ff0000>Additional remarks</fc> |^
^| Consider robot movement only in 4-neighborhood ||
^| Consider the game with one evader and two pursuers ||
^| For **Value-iteration** policy you can precompute the optimal policy and save it to the file ''mazes/pacman.policy''. You can load the file, within the ''player.py'' class if it fits the particular scenario. The file has to be smaller than 200~MB and shared using the [[https://filesender.cesnet.cz]] with lab teachers. Note, the implementation of the strategy has to work without the precomputed policy file, i.e., if the file with correct policy is not present, calculate the policy. ||
^| For **Monte-Carlo tree search** you are given a time for making the decision for a single robot. This time is defined by ''self.timeout'' constant within the ''player.py'' class. A possible usage in the code might look like:<code python>
def monte_carlo_policy(self, grid_map, evaders, pursuers):
    self.next_robots = self.robots[:]
		
    #for each robot plan actions
    for idx in range(0, len(self.robots)):
	clk = time.time()
        while (time.time() - clk) < self.timeout:
            #monte-carlo tree search
        #make decision on where to go based on tree search
        self.next_robots[idx] = pos_selected
</code>||
^| Single strategy implementation is expected. Therefore, additional points (** 5 points **) will be awarded if both strategies are implemented. ||


=== Evaluation - how will be the homeworks evaluated ===
The assignment will be evaluated within the provided game simulation framework.\\
  - **Testing in ''pacman.game'' scenario** - submitted solution will be tested on the full pacman scenario - the testing scenario is parametrized in ''mazes/pacman.game'' and has the following settings:
      -  For **Value-iteration** policy:<code>EVADER VALUE_ITERATION ro 1 1
PURSUER VALUE_ITERATION bo 3 3 17 19</code> The precomputed policy can be saved in ''mazes/pacman.policy'' and submitted as described above
      - For **Monte-Carlo tree search** policy:<code>EVADER MONTE_CARLO ro 1 1
PURSUER MONTE_CARLO bo 3 3 17 19
</code> The time for a single robot decision ''self.timeout'' is set to 5s
  - **Testing on a small maze** - submitted solutions will be tested within a small maze, e.g., ''grid4x4'' grid map, to evaluate that the strategy is actually calculated for the particular scenario