Search
We don't know the model of the robot-agent; it's behaving somewhat strangely, the path to the goal is unknown, with some traps along the way: what do we do?
Traditional quiz: calculate Q values from training episodes using direct evaluation.
Finish the Markov decision process assignment: deadline at the end of the week.
Start working on the Reinforcement Learning assignment, deadline on BRUTE.