This task extends the previous one, so students are advised to familiarize themselves with the T4a-rl - Reinforcement Learning task first; all of the information from that task applies to this one as well.
This task further evaluates the trained gait, either by deploying it on a real inchworm robot or by an extended evaluation in the BRUTE system.
The project files (evaluator.py and agent.msh) are submitted to the BRUTE system before the student's exam, and the automatic evaluation then assigns additional points. Deployment is possible during the project submission, when the experimental setup, including the required packages and hardware, is prepared on designated PCs for the students' convenience.
Before the deployment, the submitted agent.msh is downloaded from BRUTE to a designated PC. The gait is run for 30 seconds on the real robot, starting in the most elongated pose.
The median of three runs is used as the final score for the deployment. A run in which the robot loses stability or otherwise fails can be repeated, with at most three repetitions in total.
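The median rule for the real-robot runs can be illustrated with a minimal sketch. The function name `deployment_result` and the sample values are hypothetical, and the sketch assumes each run has already been reduced to a single number; how that number is measured on the real robot is not specified here.

```python
from statistics import median


def deployment_result(run_values):
    """Return the median of exactly three valid real-robot runs.

    `run_values` holds one number per valid run (failed runs are assumed
    to have been repeated already and are not included).
    """
    if len(run_values) != 3:
        raise ValueError("expected exactly three valid runs")
    return median(run_values)


# Hypothetical measurements from three valid runs.
print(deployment_result([0.12, 0.30, 0.18]))
```

Taking the median rather than the mean means that a single unusually good or bad run does not change the result.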
Aside from the evaluation on the real robot, the trained agent's policy is run for 30 seconds by the evaluation system. The distance travelled by the backmost servomotor (servo-3) is measured and averaged over the last ten simulation steps to determine the distance travelled by the robot. BRUTE evaluates each run as follows.
These points are summed up, and up to 5 points are assigned for the simulation run. Ten simulation runs are executed, and the median of the points achieved is assigned as the final score.
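The two aggregation steps described above, averaging the servo-3 distance over the last ten simulation steps and taking the median over ten runs, might look roughly like the following sketch. It is not the BRUTE implementation: the function names are hypothetical, the distance is assumed to be the displacement of servo-3 along the forward axis relative to its starting position, and the per-run points are taken as given because the rule that converts a run into points is not reproduced here.

```python
from statistics import median


def travelled_distance(servo3_positions, window=10):
    """Average displacement of servo-3 over the last `window` steps.

    `servo3_positions` is assumed to be the forward-axis coordinate of
    servo-3 at every simulation step; displacement is measured relative
    to the first sample.
    """
    if len(servo3_positions) < window:
        raise ValueError("not enough simulation steps")
    start = servo3_positions[0]
    tail = servo3_positions[-window:]
    return sum(x - start for x in tail) / window


def simulation_score(points_per_run, cap=5.0):
    """Median of the per-run points, each capped at `cap` points."""
    capped = [min(p, cap) for p in points_per_run]
    return median(capped)


# Hypothetical trajectory: the robot rests for 5 steps, then sits at x = 1.0.
print(travelled_distance([0.0] * 5 + [1.0] * 10))
# Hypothetical points from ten simulation runs.
print(simulation_score([0, 1, 2, 3, 4, 5, 5, 5, 6, 7]))
```

Averaging over the final steps smooths out the oscillation of the body at the end of a gait cycle, so the result depends less on the exact step at which the run stops.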
Then, the points assigned for the deployment and for the extended evaluation in BRUTE are summed, and up to 5 points are assigned as the final score.
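As a small illustration of the final step, the sum could be computed as below. The function name `final_score` is hypothetical, and reading "up to 5 points" as a hard cap on the sum is an assumption.

```python
def final_score(deployment_points, simulation_points, cap=5.0):
    """Sum both contributions and cap the total (assumed cap of 5 points)."""
    return min(cap, deployment_points + simulation_points)


# Hypothetical inputs: the raw sum of 7 points is capped at 5.
print(final_score(3.0, 4.0))
```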
For a real deployment, two additional criteria should be considered. Firstly, the proposed gait should be able to disengage the scales when moving forward and reengage them when staying in place. Secondly, the proposed gait should not lift the centre of mass unnecessarily, as this makes the robot more prone to losing balance.