The evaluation is divided as follows:
Automatic evaluation tests the correctness of the computed policy (matching your returned actions against the optimal actions for all states) in several different environments (possibly with different discount factors).
Manual evaluation is based on a review of the code (clean code).
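For orientation, value iteration is one way to produce the policy the automatic evaluation checks. The sketch below is illustrative only: the `mdp` interface (`states`, `actions(s)`, and `transitions(s, a)` returning `(probability, next_state, reward)` triples) is an assumption for this example, not the course API.

```python
def value_iteration(mdp, gamma=0.95, eps=1e-6):
    """Return a dict mapping each state to a greedy (optimal) action.

    Assumes a hypothetical mdp object with `states`, `actions(s)`,
    and `transitions(s, a) -> [(prob, next_state, reward), ...]`.
    """
    values = {s: 0.0 for s in mdp.states}
    while True:
        delta = 0.0
        for s in mdp.states:
            # Bellman optimality backup: best expected return over actions.
            best = max(
                sum(p * (r + gamma * values[s2])
                    for p, s2, r in mdp.transitions(s, a))
                for a in mdp.actions(s)
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < eps:
            break
    # Extract the greedy policy from the converged value function.
    return {
        s: max(mdp.actions(s),
               key=lambda a: sum(p * (r + gamma * values[s2])
                                 for p, s2, r in mdp.transitions(s, a)))
        for s in mdp.states
    }
```

Note that several actions may be equally optimal in a state; the grader presumably accepts any of them when computing the policy match.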
| Evaluated performance | min | max | note |
| --- | --- | --- | --- |
| Quality of value iteration algorithm | 0 | 2.5 | Evaluation of the algorithm by the automatic evaluation system. |
| Quality of policy iteration algorithm | 0 | 2.5 | Evaluation of the algorithm by the automatic evaluation system. |
| Code quality | 0 | 1 | Comments, structure, elegance, code cleanliness, appropriate variable naming… |
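The second graded algorithm, policy iteration, can be sketched along the same lines. As above, this is a hedged illustration: the `mdp` interface (`states`, `actions(s)`, `transitions(s, a)` yielding `(probability, next_state, reward)` triples) is assumed for the example and is not the course API.

```python
def policy_iteration(mdp, gamma=0.95, eps=1e-6):
    """Alternate policy evaluation and greedy improvement until stable."""
    policy = {s: mdp.actions(s)[0] for s in mdp.states}
    values = {s: 0.0 for s in mdp.states}
    while True:
        # Policy evaluation: iterate V under the current fixed policy.
        while True:
            delta = 0.0
            for s in mdp.states:
                v = sum(p * (r + gamma * values[s2])
                        for p, s2, r in mdp.transitions(s, policy[s]))
                delta = max(delta, abs(v - values[s]))
                values[s] = v
            if delta < eps:
                break
        # Policy improvement: act greedily with respect to V.
        stable = True
        for s in mdp.states:
            best = max(mdp.actions(s),
                       key=lambda a: sum(p * (r + gamma * values[s2])
                                         for p, s2, r in mdp.transitions(s, a)))
            if best != policy[s]:
                policy[s] = best
                stable = False
        if stable:
            return policy
```

Both sketches should converge to the same policy (up to ties between equally optimal actions), which is why the two rows carry the same point range.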
Automatic evaluation:
policy match 95% and more (averaged over n tested mazes): 2.5 points
policy match 90%-95%: 2 points
policy match 85%-90%: 1.5 points
policy match 80%-85%: 1 point
policy match 70%-80%: 0.5 points
policy match below 70%: 0 points
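For intuition, the match percentage on this scale can be computed as the share of states where your action agrees with the reference action. This is only a sketch of the idea; the function name is hypothetical, and the real grader may well accept any of several equally optimal actions per state.

```python
def policy_match(my_policy, optimal_policy):
    """Percentage of states where the returned action matches the optimum.

    Both arguments are dicts mapping states to actions; states absent
    from my_policy count as mismatches.
    """
    hits = sum(my_policy.get(s) == a for s, a in optimal_policy.items())
    return 100.0 * hits / len(optimal_policy)
```

For example, agreeing on 3 of 4 states yields a 75% match, which would score 0.5 points on the scale above.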
Code quality (1 point):
appropriate comments, or the code is understandable enough that it does not need comments
reasonably short methods/functions
variable names (nouns) and function names (verbs) that aid readability and understandability
pieces of code do not repeat (no copy-paste)
reasonable use of memory and processor time
consistent names and code layout throughout the file (separate words in all methods in the same way, etc.)
clear code structure (avoid, for example, unpythonic assignment of many variables in one line)
…
You can follow PEP 8 (the Python style guide). Most editors (including PyCharm and VS Code) flag PEP 8 violations themselves. You can also get inspired e.g. here, or read about idiomatic Python on Medium or elsewhere.
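As a small illustration of the naming and layout points above (the names are invented for the example, not taken from the assignment):

```python
# Harder to read: cryptic names, many assignments crammed into one line.
w, h, g = 8, 6, 0.95

# Easier to read: substantive (noun) names, one assignment per line.
maze_width = 8
maze_height = 6
discount_factor = 0.95


# Functions are named with verbs and kept short.
def count_states(width, height):
    """Return the number of grid cells in a width x height maze."""
    return width * height
```

The same conventions (snake_case names, one statement per line, short documented functions) should then be applied consistently throughout the file.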