====== Lab11 - Game Theory in Robotics ======

^ Motivations and Goals ^
| Become familiar with the pursuit-evasion scenario |
| Be able to establish a greedy policy for the pursuit-evasion problem |
^ Tasks ([[courses:b4m36uir:internal:instructions:lab11|teacher]]) ^
| Implement and verify the functionality of a greedy policy for the pursuit-evasion problem |
^ Lab resources ^
| Lab scripts: {{ :courses:b4m36uir:hw:task11.zip |Task11 resource files}} |

===== Pursuit Evasion =====

Pursuit evasion is a problem in computer science in which a set of pursuers tries to catch a set of evaders.\\
{{:courses:b4m36uir:labs:pursuit-evaision.png?200|}}

==== Greedy Policy ====

Under a greedy policy, the next-best state is selected in each discrete step of the game simulation, without considering a longer prediction horizon.\\
Usual policies incorporate the distances between individual agents as follows:
  * An evader selects the node most distant from the closest pursuer.
  * A pursuer selects the node closest to the closest evader.

==== Limitations of Greedy Policy ====

Find counterexamples where the greedy policy fails:
  * based on Euclidean distance,
  * based on shortest-path distance.

==== Monte Carlo Tree Search ====

  * MCTS (https://en.wikipedia.org/wiki/Monte_Carlo_tree_search)
  * Implementation by tree vs. hash map
  * UCB1
  * Dealing with simultaneous moves (http://storage.kghost.de/cig_proc/full/paper_61.pdf)
    * serialized UCT
    * decoupled UCT (need for randomization)
  * MCTS improvements (RAVE)
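The greedy policies described above can be sketched as follows. This is a minimal illustration, not the lab's reference implementation: it assumes the environment is an undirected graph given as an adjacency dictionary, uses BFS hop counts as the shortest-path distance, and the function and variable names (''graph'', ''greedy_pursuer_step'', ''greedy_evader_step'') are invented for this example.

```python
import math
from collections import deque

def bfs_distances(graph, start):
    """Shortest-path (hop-count) distances from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def greedy_pursuer_step(graph, pursuer, evaders):
    """Pick the neighbor (or current node) closest to the closest evader."""
    best, best_score = pursuer, math.inf
    for candidate in [pursuer] + list(graph[pursuer]):
        dist = bfs_distances(graph, candidate)
        score = min(dist.get(e, math.inf) for e in evaders)
        if score < best_score:
            best, best_score = candidate, score
    return best

def greedy_evader_step(graph, evader, pursuers):
    """Pick the neighbor (or current node) most distant from the closest pursuer."""
    best, best_score = evader, -math.inf
    for candidate in [evader] + list(graph[evader]):
        dist = bfs_distances(graph, candidate)
        score = min(dist.get(p, math.inf) for p in pursuers)
        if score > best_score:
            best, best_score = candidate, score
    return best

# Example: a path graph 0-1-2-3-4, pursuer at node 0, evader at node 4.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

On the path graph above, the pursuer at node 0 greedily advances to node 1, while the evader at node 4 stays put, since every move would bring it closer to the pursuer. Evaluating only the immediate next step like this is exactly what makes the policy fail on the counterexample graphs you are asked to find.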
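For the MCTS selection step, the UCB1 rule balances exploitation (average reward of a child) against exploration (a bonus for rarely visited children). A minimal sketch of the scoring function, with the customary exploration constant c = √2 as an assumed default:

```python
import math

def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of a child node: mean reward plus exploration bonus.

    Unvisited children get an infinite score so they are expanded first.
    """
    if visits == 0:
        return math.inf
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)
```

During selection, the child with the highest UCB1 score is followed; note that with equal mean rewards the less-visited child scores higher, which is what drives exploration.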