====== 09 Reinforcement Learning II ======

How not to repeat yourself. We've already found a way, but maybe there's a better place somewhere.

===== Quiz =====


The traditional quiz: this time you will calculate Q-values from training episodes using the temporal difference method.
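
To practice before the quiz, here is a minimal sketch of how Q-values can be computed from recorded episodes. It assumes the Q-learning form of the temporal difference update, Q(s,a) ← Q(s,a) + α (r + γ max Q(s',a') − Q(s,a)); the quiz may use a different variant, learning rate, or discount factor. The function name ''td_q_values'', the tuple format of the transitions, and the two toy episodes are made up for illustration only.

```python
from collections import defaultdict


def td_q_values(episodes, actions, alpha=0.5, gamma=1.0):
    """Estimate Q-values from recorded episodes (Q-learning form of the TD update).

    Each episode is a list of (state, action, reward, next_state) tuples;
    next_state is None when the transition ends the episode.
    This format is an assumption made for this sketch.
    """
    q = defaultdict(float)  # every unseen (state, action) pair starts at 0
    for episode in episodes:
        for state, action, reward, next_state in episode:
            if next_state is None:
                best_next = 0.0  # terminal transition: no future value
            else:
                best_next = max(q[(next_state, a)] for a in actions)
            sample = reward + gamma * best_next
            # Move the old estimate a fraction alpha towards the new sample.
            q[(state, action)] += alpha * (sample - q[(state, action)])
    return q


# Hypothetical toy data: the same two-step episode observed twice.
episodes = [
    [("A", "right", 0.0, "B"), ("B", "right", 1.0, None)],
    [("A", "right", 0.0, "B"), ("B", "right", 1.0, None)],
]

q = td_q_values(episodes, actions=["left", "right"], alpha=0.5, gamma=1.0)
print(q[("A", "right")], q[("B", "right")])  # prints: 0.25 0.75
```

After the first episode only Q(B, right) changes (to 0.5), because Q(A, right) is updated before anything is known about B. The second episode propagates that value back to A, which is why the reward needs several episodes to reach earlier states.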

===== Individual Work: next assignment =====

Work on the [[courses:be5b33kui:labs:rl:start|Reinforcement learning assignment]].

===== Reinforcement learning plus =====

Reinforcement learning is now a very active research area, boosted by rapid progress in deep neural network learning. A few links for further inspiration:

  * [[https://www.youtube.com/watch?v=SH3bADiB7uQ|Table tennis robot player]]. Starting from imitation, then generalizing through RL.
  * [[https://research.google.com/teams/brain/robotics/|Robotics@google]]. Well, they can afford many learning episodes and many iterations ;-)
  * [[https://medium.com/@dhruvp/how-to-write-a-neural-network-to-play-pong-from-scratch-956b57d4f6e0|Pong game]]. Learning to play the very old computer game with the help of OpenAI Gym. [[https://www.youtube.com/watch?time_continue=6&v=YOW8m2YGtRg|YT Video]]