====== Lab11 - Game Theory in Robotics ======

^ Motivations and Goals ^
| Become familiar with the pursuit-evasion scenario |
| Be able to establish a greedy policy for the pursuit-evasion problem |
^ Tasks ([[courses:b4m36uir:internal:instructions:lab11|teacher]]) ^
| Implement and verify the functionality of a greedy policy for the pursuit-evasion problem |
^ Lab resources ^
| Lab scripts: {{ :courses:b4m36uir:hw:task11.zip |Task11 resource files}} |

===== Pursuit Evasion =====

Pursuit evasion is a problem in computer science in which a set of pursuers tries to catch a set of evaders.\\
{{:courses:b4m36uir:labs:pursuit-evaision.png?200|}}

==== Greedy Policy ====

Under a greedy policy, the next-best state is selected in each discrete step of the game simulation, without considering a longer prediction horizon.\\
Usual policies incorporate the distances between individual agents as follows:
  * An evader selects the node most distant from the closest pursuer.
  * A pursuer selects the node closest to the closest evader.

==== Limitations of Greedy Policy ====

Find counterexamples where the greedy policy fails:
  * based on Euclidean distance,
  * based on shortest-path distance.

==== Monte Carlo Tree Search ====

  * MCTS (https://en.wikipedia.org/wiki/Monte_Carlo_tree_search)
  * Implementation by tree vs. hash map
  * UCB1
  * Dealing with simultaneous moves (http://storage.kghost.de/cig_proc/full/paper_61.pdf)
    * serialized UCT
    * decoupled UCT (need for randomization)
  * MCTS improvements (RAVE)
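The greedy policies described above can be sketched as follows. This is a minimal illustration, not the lab's reference implementation: it assumes the environment is an undirected graph given as an adjacency dictionary, uses BFS hop counts as the shortest-path distance, and the function and variable names (''graph'', ''greedy_pursuer_step'', ''greedy_evader_step'') are invented for this example.

```python
import math
from collections import deque

def bfs_distances(graph, start):
    """Shortest-path (hop-count) distances from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def greedy_pursuer_step(graph, pursuer, evaders):
    """Pick the neighbor (or current node) closest to the closest evader."""
    best, best_score = pursuer, math.inf
    for candidate in [pursuer] + list(graph[pursuer]):
        dist = bfs_distances(graph, candidate)
        score = min(dist.get(e, math.inf) for e in evaders)
        if score < best_score:
            best, best_score = candidate, score
    return best

def greedy_evader_step(graph, evader, pursuers):
    """Pick the neighbor (or current node) most distant from the closest pursuer."""
    best, best_score = evader, -math.inf
    for candidate in [evader] + list(graph[evader]):
        dist = bfs_distances(graph, candidate)
        score = min(dist.get(p, math.inf) for p in pursuers)
        if score > best_score:
            best, best_score = candidate, score
    return best

# Example: a path graph 0-1-2-3-4, pursuer at node 0, evader at node 4.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

On the path graph above, the pursuer at node 0 greedily advances to node 1, while the evader at node 4 stays put, since every move would bring it closer to the pursuer. Evaluating only the immediate next step like this is exactly what makes the policy fail on the counterexample graphs you are asked to find.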
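For the MCTS selection step, the UCB1 rule balances exploitation (average reward of a child) against exploration (a bonus for rarely visited children). A minimal sketch of the scoring function, with the customary exploration constant c = √2 as an assumed default:

```python
import math

def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of a child node: mean reward plus exploration bonus.

    Unvisited children get an infinite score so they are expanded first.
    """
    if visits == 0:
        return math.inf
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)
```

During selection, the child with the highest UCB1 score is followed; note that with equal mean rewards the less-visited child scores higher, which is what drives exploration.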