Search
How not to repeat yourself. We've already found a way, but maybe there's a better place somewhere.
Traditional quiz, this time to calculate Q values from training episodes using time difference method
Work on the Reinforcement learning assignment.
Reinforecement learning is now a very active area, also supported by rapid progress in deep neural network learning. A few links for further inspiration: