RL: Scoring

The evaluation is composed of:

  1. Automatic evaluation tests the performance of your agent in 5 environments. We run the strategy found by your agent in a given environment n times and compute the average sum of rewards it collects. We then compare this with the teacher's solution (an agent executing the optimal strategy). For each testing environment, you earn one point if your strategy reaches at least 80% of the teacher's average sum of rewards (a minimal sketch follows this list).
  2. Manual evaluation is based on code quality (clean code).
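
For concreteness, a minimal sketch of how the point for one environment could be computed. This is an illustration under assumptions, not the actual grading code; run_episode, n_runs, and the function names are made up for the example:

  def average_return(run_episode, n_runs=100):
      """Average sum of rewards over n_runs independent runs of a strategy.

      run_episode is a zero-argument callable (an assumption for this
      sketch) that executes the strategy once in the environment and
      returns the sum of rewards collected.
      """
      return sum(run_episode() for _ in range(n_runs)) / n_runs

  def point_earned(student_avg, teacher_avg, threshold=0.8):
      """One point if the student reaches at least 80% of the teacher's value."""
      return 1 if student_avg >= threshold * teacher_avg else 0
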
Evaluated Performance     min  max  note
Quality of RL algorithm    0    5   Evaluation of the algorithm by the automatic evaluation system.
Quality of code            0    1   Comments, structure, elegance, cleanliness of code, appropriate naming of variables…

Quality of code (1 point):

  • appropriate comments, or the code is understandable enough that it does not need comments
  • methods/functions of reasonable, preferably short, length
  • variable names (nouns) and function/method names (verbs) help readability and understandability
  • pieces of code do not repeat (no copy-paste)
  • reasonable use of memory and processor time
  • consistent style of code throughout the entire file
  • clear structure of code (avoid, for example, unpythonic assignment of many unrelated variables on one line; a short example follows this list)
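
To illustrate the last point, a small hypothetical before/after example (the variable names are invented):

  # Unpythonic: many unrelated assignments crammed into one line
  x, y, total_reward, done, steps = 0, 0, 0.0, False, 0

  # Clearer: one idea per line, with names that carry meaning
  position_x = 0
  position_y = 0
  total_reward = 0.0
  episode_finished = False
  step_count = 0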

You can follow PEP8, the style guide for Python code. Most editors (certainly PyCharm) point out PEP8 violations themselves. You can also find inspiration, for example, here, or read about idiomatic Python on Medium.
