09 Reinforcement Learning II

How not to repeat yourself. We have already found a way that works, but maybe there is a better one somewhere. Also known as exploitation vs. exploration.
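One common way to balance the two is the epsilon-greedy rule: with a small probability the agent tries a random action, otherwise it takes the action that currently looks best. The sketch below is only an illustration; the function name `epsilon_greedy` and its parameters are assumptions made for this example, not part of the course code.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index for one state.

    q_values: list of current Q estimates, one per action (hypothetical input).
    epsilon:  probability of exploring instead of exploiting.
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimate (first one on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])

# With epsilon = 0 the agent always exploits its current knowledge.
best = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

With `epsilon=0.0` the agent never leaves the path it already knows; with `epsilon=1.0` it ignores what it has learned. Values in between trade one for the other.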

Quiz for bonus points

  • Calculate Q-values from training episodes using the temporal difference method
  • 0.5 points
  • submit your solution to BRUTE lab09quiz by April 27, midnight
  • format: text file, photo of your solution on paper, or pdf, whichever is convenient for you
  • the solution will be discussed in the next lab
  • quiz assignment: [will be accessible from Monday, April 27]
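
As a warm-up for the quiz, here is a minimal sketch of a temporal difference update applied to recorded episodes. It assumes the Q-learning form of the update, Q(s,a) ← Q(s,a) + α [r + γ max Q(s',a') − Q(s,a)]; the quiz may ask for a different variant. The episode format, the function name `td_q_values`, and the toy states and rewards are all made up for illustration.

```python
def td_q_values(episodes, actions, alpha=0.5, gamma=1.0):
    """Estimate Q-values from episodes with a Q-learning style TD update.

    episodes: list of episodes, each a list of (state, action, reward, next_state);
              next_state is None when the episode ends (assumed format).
    actions:  list of all actions available in every state (assumption).
    """
    q = {}  # maps (state, action) to its estimate; missing entries count as 0
    for episode in episodes:
        for s, a, r, s_next in episode:
            if s_next is None:
                best_next = 0.0  # nothing follows a terminal transition
            else:
                best_next = max(q.get((s_next, b), 0.0) for b in actions)
            old = q.get((s, a), 0.0)
            # Move the old estimate a fraction alpha towards the TD target.
            q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return q

# Toy data: the same two-step episode observed twice.
episode = [("A", "right", 0, "B"), ("B", "right", 10, None)]
q = td_q_values([episode, episode], actions=["left", "right"])
# q[("A", "right")] is 2.5 and q[("B", "right")] is 7.5
```

After the first pass only the last transition has a non-zero estimate; the reward reaches state A only in the second pass, which is why the estimates depend on the order of the samples.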

Quiz II / Solving together during interactive lab

Effect of discount factor on policy. Lab exercise on discount factors (pdf)
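
To get a feeling for the effect before opening the exercise, the sketch below compares the discounted return of two hypothetical paths: a short one with a small reward and a longer one with a larger reward. The reward numbers and the names `discounted_return` and `preferred_path` are invented for this illustration and are not taken from the lab pdf.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a sequence of rewards."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

def preferred_path(gamma):
    """Return which of two made-up paths has the higher discounted return."""
    near = [1]            # small reward right away
    far = [0, 0, 0, 10]   # larger reward three steps later
    if discounted_return(far, gamma) > discounted_return(near, gamma):
        return "far"
    return "near"

patient = preferred_path(0.9)    # "far": the distant reward still dominates
impatient = preferred_path(0.1)  # "near": the distant reward is almost worthless
```

The same environment therefore yields different greedy policies for different discount factors: a value close to 1 favours long-term reward, a value close to 0 favours whatever pays off immediately.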

Individual Work

Reinforcement learning plus

Reinforcement learning is now a very active research area, helped along by rapid progress in deep neural network learning. A few links for further inspiration:

courses/be5b33kui/labs/weekly/week_09.txt · Last modified: 2020/04/29 12:10 by hoffmmat