This task specification was created by one of your older schoolmates.
Description of the Frozen Lake environment at AI Gym.
You are given a 2D map (8×8 or 4×4) with a start position, a goal position, ice and holes. Your task is to evolve a control strategy that gets you from the start to the goal as quickly as possible without falling through a hole in the ice into the cold water. You can move up, down, left or right. The ice can be slippery or non-slippery.
The simulation ends when you reach the goal position or step into a hole in the ice. The evaluation you get from AI Gym is 1 if you reach the goal and 0 otherwise.
For slippery ice you can execute several runs and estimate the probability of reaching the goal. For non-slippery ice you will get 0 if you do not reach the goal position, no matter how close you get. (Maybe it could be changed to 1/x, where x is the Manhattan distance to the goal position?)
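Both evaluation ideas can be sketched without AI Gym at all. The code below is only an illustration under stated assumptions: it uses the default 4×4 layout, models slippery ice as "intended direction or one of the two perpendicular ones, each equally likely", and replaces 1/x by 1/(1+x) so that standing on the goal (x = 0) gives exactly 1 and no division by zero occurs. The names `run_episode`, `success_rate` and `shaped_fitness` are hypothetical helpers, not part of any library.

```python
import random

# Default 4x4 layout: S start, F ice, H hole, G goal.
MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
# Actions: 0 = left, 1 = down, 2 = right, 3 = up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def step(pos, action, slippery, rng):
    """Apply one move; moves that would leave the map keep you at the edge."""
    if slippery:
        # Assumption: slide to a perpendicular direction with probability 2/3.
        action = rng.choice([(action - 1) % 4, action, (action + 1) % 4])
    dr, dc = MOVES[action]
    n = len(MAP)
    r = min(max(pos[0] + dr, 0), n - 1)
    c = min(max(pos[1] + dc, 0), n - 1)
    return (r, c)


def run_episode(policy, slippery, rng, max_steps=100):
    """Return (reward, final position); reward is 1 only if the goal is reached."""
    pos = (0, 0)
    for _ in range(max_steps):
        pos = step(pos, policy[pos[0]][pos[1]], slippery, rng)
        cell = MAP[pos[0]][pos[1]]
        if cell == "G":
            return 1, pos
        if cell == "H":
            return 0, pos
    return 0, pos


def success_rate(policy, runs=1000, seed=0):
    """Estimate the probability of reaching the goal on slippery ice."""
    rng = random.Random(seed)
    return sum(run_episode(policy, True, rng)[0] for _ in range(runs)) / runs


def shaped_fitness(policy):
    """Non-slippery fitness 1/(1+x), x = Manhattan distance from the final cell to the goal."""
    _, (r, c) = run_episode(policy, False, random.Random(0))
    n = len(MAP)
    x = (n - 1 - r) + (n - 1 - c)
    return 1.0 / (1 + x)
```

With this shaping, a strategy that ends one step from the goal scores higher than one that never leaves the start, which gives the search a gradient to follow. Note that it does not distinguish between stopping on ice and falling into a hole at the same distance.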
You should evolve/search for the strategies, so you need to represent a strategy somehow. Possible representations:
* A 4×4/8×8 matrix (or vector) representing the strategy (policy) directly: the value in each cell tells you which way to go when you get to that position.
* A matrix (or vector) representing the “profit” you get by stepping onto each position (how good it is to go to the goal via that position). The decision in each cell is then to move toward the most profitable neighbor.
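A minimal sketch of both representations, assuming actions are encoded as 0 = left, 1 = down, 2 = right, 3 = up. The helpers `random_policy`, `mutate` and `policy_from_profit` are hypothetical names, and the profit values are hand-made for the default 4×4 map purely for illustration (holes get -1, the goal gets the highest value).

```python
import random

# Actions: 0 = left, 1 = down, 2 = right, 3 = up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def random_policy(n, rng):
    """Direct representation: one action per cell of an n x n map."""
    return [[rng.randrange(4) for _ in range(n)] for _ in range(n)]


def mutate(policy, rng, rate=0.1):
    """Return a copy in which each cell is re-drawn at random with probability `rate`."""
    return [[rng.randrange(4) if rng.random() < rate else a for a in row]
            for row in policy]


def policy_from_profit(profit):
    """Indirect representation: in each cell, head for the most profitable in-bounds neighbor."""
    n = len(profit)
    policy = [[0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            best_action, best_value = 0, float("-inf")
            for action, (dr, dc) in MOVES.items():
                nr, nc = r + dr, c + dc
                # Ties are broken in favor of the lower action number.
                if 0 <= nr < n and 0 <= nc < n and profit[nr][nc] > best_value:
                    best_action, best_value = action, profit[nr][nc]
            policy[r][c] = best_action
    return policy


# Hand-made example values; an evolutionary search would tune these numbers instead.
profit = [
    [0, 1, 2, 1],
    [1, -1, 3, -1],
    [2, 3, 4, -1],
    [-1, 4, 5, 6],
]
policy = policy_from_profit(profit)
```

The direct representation has a small discrete search space (4 choices per cell), while the profit matrix is real-valued and lets standard real-coded operators be used; which one evolves faster is something to find out experimentally.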