09 Reinforcement Learning II

How not to repeat yourself. We've already found a way, but maybe there's a better place somewhere. A.k.a. exploitation vs. exploration.
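
One common way to balance exploitation and exploration is an epsilon-greedy rule: with a small probability take a random action, otherwise take the action that currently looks best. The following is only a minimal sketch under assumptions. The function name `epsilon_greedy`, the dict `q_values` keyed by `(state, action)`, and the parameter `epsilon` are illustrative and not taken from the course code.

```python
import random

def epsilon_greedy(q_values, state, actions, epsilon, rng=random):
    """Return a random action with probability epsilon, else the greedy one.

    q_values is assumed to be a dict mapping (state, action) to an estimate;
    pairs that were never visited are treated as 0.0.
    """
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore: try something, possibly new
    # exploit: pick the action with the highest current estimate
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))
```

With `epsilon=0.0` the agent only exploits what it already knows; larger values make it try other actions more often.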

Quiz for bonus points

  • Calculate Q values from training episodes using the temporal difference method
  • 0.5 points
  • submit your solution to BRUTE lab09quiz by April 19, midnight
  • format: text file, photo of your solution on paper, or PDF, whichever is convenient for you
  • the solution will be discussed in the next lab
  • quiz assignment: students whose family name starts with A to L (inclusive) solve and upload subject A; students whose family name starts with M to Z solve and upload subject B
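
To check a hand calculation of this kind, a temporal difference update can be sketched in a few lines. The sketch below assumes the Q-learning form of the update, Q(s,a) ← Q(s,a) + α·(r + γ·max Q(s',a') − Q(s,a)). The episode, the state and action names, and the values of `alpha` and `gamma` are made up for illustration and are not the quiz data.

```python
from collections import defaultdict

def td_q_update(q, transition, alpha, gamma, actions):
    """Apply one Q-learning style TD update for a (s, a, r, s_next) sample."""
    s, a, r, s_next = transition
    # Best estimate available from the next state; a terminal state (None) is worth 0.
    best_next = 0.0 if s_next is None else max(q[(s_next, b)] for b in actions)
    td_target = r + gamma * best_next
    # Move the old estimate a fraction alpha towards the TD target.
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q

# Hypothetical two-step episode, replayed twice: A -> B -> terminal.
q = defaultdict(float)  # all Q values start at 0.0
episode = [("A", "right", 0.0, "B"), ("B", "right", 1.0, None)]
for _ in range(2):
    for transition in episode:
        td_q_update(q, transition, alpha=0.5, gamma=0.9, actions=["left", "right"])
```

After the two replays, the reward observed at the end of the episode has started to propagate back to the earlier state through the `max` term.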

Quiz II / Solving together during interactive lab

Effect of the discount factor on the policy. See the PDF.
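
As a rough illustration of why the discount factor matters, compare a small reward that is close with a large reward that is far away. The sketch below is not the example from the PDF. The two reward sequences and the names `discounted_return` and `preferred_path` are assumptions chosen only to show the effect.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a sequence of rewards."""
    return sum(gamma**t * r for t, r in enumerate(rewards))

def preferred_path(paths, gamma):
    """Name of the path with the highest discounted return."""
    return max(paths, key=lambda name: discounted_return(paths[name], gamma))

# Hypothetical choice: reward 1 right away, or reward 10 after four empty steps.
paths = {
    "near": [1.0],
    "far": [0.0, 0.0, 0.0, 0.0, 10.0],
}
```

With `gamma=0.9` the distant reward is still worth about 6.56, so `preferred_path(paths, 0.9)` picks `"far"`; with `gamma=0.5` it shrinks to 0.625 and the nearby reward wins.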

Individual Work

Reinforcement learning plus

Reinforcement learning is now a very active area, also supported by rapid progress in deep neural network learning. A few links for further inspiration:

courses/be5b33kui/labs/weekly/week_09.txt · Last modified: 2021/04/26 10:47 by gamafili