10 Reinforcement Learning III

How does Q-learning work in environments with continuous states or actions? This topic goes beyond the course contents.

Learning outcomes

After this practice session, the student

  • knows about linear regression as a possible tool to model V- and Q-functions;
  • understands, in principle, how approximate Q-learning works.
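
As a rough illustration of the first outcome, the sketch below fits a linear model to a V-function by least squares. The sample states, the target values, and the feature map [1, s] are hypothetical choices made only for this example; they are not taken from the lab assignment.

```python
import numpy as np

# Hypothetical training data: 1-D states and value estimates for them.
# The values are chosen to lie exactly on the line 2*s + 1.
states = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
values = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

def features(s):
    """Map each state to the feature vector [1, s] (assumed feature choice)."""
    s = np.atleast_1d(s).astype(float)
    return np.column_stack([np.ones_like(s), s])

# Least squares: find w minimizing ||Phi w - values||^2.
w, *_ = np.linalg.lstsq(features(states), values, rcond=None)

def v_hat(s):
    """Approximate state value as a linear function of the features."""
    return features(s) @ w

print(w)           # fitted weights: intercept and slope
print(v_hat(2.5))  # value estimate for a state not in the sample
```

The same idea carries over to Q-functions by building the features from state-action pairs instead of states alone.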

Program

  • Q/A
  • Discussion of last week's bonus quiz
  • Exercise 1: Approximation minimizing the least-squares error (LSQ)
  • Exercise 2: Approximate Q-learning
  • Introduction of this week's bonus quiz

Exercise / Solving together

  • Approximation minimizing the least-squares error (LSQ)
  • Approximate Q-learning
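
To give a feel for the second exercise, here is a minimal sketch of one Q-learning update with a linear function approximator, Q(s, a) = w · f(s, a). The two-action set, the block feature map, the step size, and the discount factor are all assumptions made for illustration; the exercise itself may use a different setup.

```python
import numpy as np

ACTIONS = (-1, +1)  # hypothetical action set: step left or right

def features(s, a):
    """Block features: [1, s] in the slot of action a, zeros elsewhere."""
    f = np.zeros(2 * len(ACTIONS))
    i = ACTIONS.index(a)
    f[2 * i: 2 * i + 2] = [1.0, s]
    return f

def q_value(w, s, a):
    """Linear Q-function approximation."""
    return float(w @ features(s, a))

def q_update(w, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """One approximate Q-learning step: move w along f(s, a) by the TD error."""
    if done:
        target = r
    else:
        target = r + gamma * max(q_value(w, s_next, b) for b in ACTIONS)
    td_error = target - q_value(w, s, a)
    return w + alpha * td_error * features(s, a)

# A single hypothetical transition: from s = 0.5, action +1, reward 1.0.
w = np.zeros(2 * len(ACTIONS))
w = q_update(w, s=0.5, a=+1, r=1.0, s_next=0.6, done=False)
print(w)
print(q_value(w, 0.5, +1))
```

Because the weights are shared across states, a single update changes the Q-value estimate for every state where the same features are active, which is what makes the method usable with continuous states.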

Bonus quiz

  • Calculate state values under a random-walk policy
  • 0.5 points
  • submit your solution in BRUTE as lab10quiz; the deadline is given in BRUTE
  • format: text file, photo of your solution on paper, or PDF, whichever is convenient for you
  • the solution will be discussed at the next lab
  • Students whose family name starts with A to K (inclusive) must solve and upload subject A, while students whose family name starts with L to Z must solve and upload subject B.

Homework

courses/be5b33kui/labs/weekly/week_10.txt · Last modified: 2026/04/27 10:30 by xposik