Warning
This page is located in archive. Go to the latest version of this course pages.

HW 04 - Pursuit-evasion in Robotics

The main task of the homework is to implement an advanced player policy for pursuit-evasion game.

Deadline 14. January 2018, 23:59 PDT
Label in BRUTE HW04
Files to submit archive with .py files, mandatory file: player.py
Do not upload eval.py and game.py files
Do not upload optional pacman.policy file to BRUTE, send it via https://filesender.cesnet.cz instead.
Mandatory tasks 10 Points
Resources lab11 resource files
Mandatory tasks - Modify the player.py script to implement a player policy for pursuit-evasion game:
10 points Within the provided game simulation framework implement one of advanced pursuit evasion strategies. Choose between:
- Value-iteration policy
- Monte-Carlo tree search policy
Additional remarks
Consider robot movement only in 4-neighborhood
Consider the game with one evader and two pursuers
For Value-iteration policy you can precompute the optimal policy and save it to the file mazes/pacman.policy. You can load the file, within the player.py class if it fits the particular scenario. The file has to be smaller than 200~MB and shared using the https://filesender.cesnet.cz with lab teachers. Note, the implementation of the strategy has to work without the precomputed policy file, i.e., if the file with correct policy is not present, calculate the policy.
For Monte-Carlo tree search you are given a time for making the decision for a single robot. This time is defined by self.timeout constant within the player.py class. A possible usage in the code might look like:
def monte_carlo_policy(self, grid_map, evaders, pursuers):
    self.next_robots = self.robots[:]
 
    #for each robot plan actions
    for idx in range(0, len(self.robots)):
	clk = time.time()
        while (time.time() - clk) < self.timeout:
            #monte-carlo tree search
        #make decision on where to go based on tree search
        self.next_robots[idx] = pos_selected
Single strategy implementation is expected. Therefore, additional points ( 5 points ) will be awarded if both strategies are implemented.

Evaluation - how will be the homeworks evaluated

The assignment will be evaluated within the provided game simulation framework.

  1. Testing in pacman.game scenario - submitted solution will be tested on the full pacman scenario - the testing scenario is parametrized in mazes/pacman.game and has the following settings:
    1. For Value-iteration policy:
      EVADER VALUE_ITERATION ro 1 1
      PURSUER VALUE_ITERATION bo 3 3 17 19
      The precomputed policy can be saved in mazes/pacman.policy and submitted as described above
    2. For Monte-Carlo tree search policy:
      EVADER MONTE_CARLO ro 1 1
      PURSUER MONTE_CARLO bo 3 3 17 19
      The time for a single robot decision self.timeout is set to 5s
  2. Testing on a small maze - submitted solutions will be tested within a small maze, e.g., grid4x4 grid map, to evaluate that the strategy is actually calculated for the particular scenario
courses/b4m36uir/hw/hw04.txt · Last modified: 2017/12/18 17:35 by cizekpe6