Table of Contents

07 Sequential II

Sequential decisions and how do we calculate the proper policy?

Quiz

Policy evaluation on a small map.

MDP

Consider the following game. We roll a die and pay 1 CZK for each roll. If we roll a six twice in a row, we win 1000 CZK and the game is over.
The game can be terminated at any time without payment.
1) Formulate the game as an MDP (states, actions, T(s, a, s'), r(s, a, s')).
2) Determine the optimal policy.
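
One possible way to check an answer to this quiz numerically is sketched below. It is not the official solution, only a minimal value-iteration sketch under stated assumptions: the die is fair and six-sided, there is no discounting, the 1 CZK fee is also paid on the winning roll, and the state only records whether the previous roll was a six. The names `transitions`, `q_value`, and `value_iteration` are made up for this illustration.

```python
# Assumed model (not given in the task text): fair die, no discounting.
# States: 0 = previous roll was not a six (also the start state),
#         1 = previous roll was a six,
#         "END" = absorbing terminal state with value 0.
ROLL_COST = 1.0
PRIZE = 1000.0
P_SIX = 1.0 / 6.0
ACTIONS = ("roll", "stop")


def transitions(state, action):
    """Return a list of (probability, next_state, reward) triples."""
    if action == "stop":
        # Quitting is free and ends the game.
        return [(1.0, "END", 0.0)]
    if state == 0:
        return [(P_SIX, 1, -ROLL_COST), (1 - P_SIX, 0, -ROLL_COST)]
    # state == 1: a second six in a row wins the prize (the roll is still paid).
    return [(P_SIX, "END", PRIZE - ROLL_COST), (1 - P_SIX, 0, -ROLL_COST)]


def q_value(state, action, values):
    """Expected reward plus value of the successor state (gamma = 1)."""
    return sum(p * (r + values[s2]) for p, s2, r in transitions(state, action))


def value_iteration(tol=1e-10, max_iter=100_000):
    values = {0: 0.0, 1: 0.0, "END": 0.0}
    for _ in range(max_iter):
        new_values = dict(values)
        for s in (0, 1):
            new_values[s] = max(q_value(s, a, values) for a in ACTIONS)
        delta = max(abs(new_values[s] - values[s]) for s in (0, 1))
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, values)) for s in (0, 1)}
    return values, policy


values, policy = value_iteration()
print(values, policy)
```

Under these assumptions the iteration should settle near V(0) = 958 and V(1) = 964, with "roll" chosen in both states. This agrees with the expected 42 rolls needed to see two sixes in a row (1000 - 42 = 958).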

Individual task

Work with the Markov decision process task.