====== Assignment: Histogram Classifier ======
**📅 Deadline:** 20.11.2024 21:59
**🏦 Points:** 4
===== Task Description =====
In this assignment, you are tasked with training a histogram classifier, computing bounds on its generalization error, and computing bounds on its estimation error. You can find the complete description of the assignment in the {{ :courses:be4m33ssu:homeworks:hw_assignment_hist_cls.pdf | Assignment PDF}}.
You are provided with a {{ :courses:be4m33ssu:homeworks:hw_histogram_classifier_template_23_10_2024.zip |template}} containing the following files:
* **main.py**: Contains the functions **learn_classifier**, **generalization_bound**, and **estimation_error_bound**, which you are required to implement.
* **utils.py**: Contains helper functions for loading and saving data. You do not need to modify this file.
* **test-cases**: A folder containing public test cases to help you verify your implementation before submitting to [[https://cw.felk.cvut.cz/brute/student/course/BE4M33SSU/histclas|BRUTE]].
Your objective is to implement the functions **learn_classifier**, **generalization_bound**, **estimation_error_bound** in **main.py**.
All Python files must be stored in the root of the submitted .zip archive.
**You may encounter issues in the private test case for instance 4. The cause is most likely the computation of |H| or log(|H|): the number |H| is so large that it overflows every numeric type supported by NumPy. Hint: the logarithm of a product is the sum of the logarithms.**
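The hint can be illustrated with a minimal sketch. It does not use the actual formula for |H| from the assignment PDF; the helper names ''log_of_product'' and ''log_of_power'' and the example numbers are hypothetical and only show how to stay in log space.

```python
import math


def log_of_product(factors):
    """Return log(f_1 * f_2 * ... * f_n) as a sum of logs, never forming the product."""
    return sum(math.log(f) for f in factors)


def log_of_power(base, exponent):
    """Return log(base ** exponent) as exponent * log(base)."""
    return exponent * math.log(base)


# Small sanity check: log(2 * 3 * 4) equals log(24).
assert math.isclose(log_of_product([2.0, 3.0, 4.0]), math.log(24.0))

# Hypothetical large count: 10.0 ** 5000 does not fit into a float,
# but its logarithm is an ordinary finite number.
log_size = log_of_power(10.0, 5000)
print(log_size)
```

If the bound only needs log(|H|), the huge value |H| never has to be materialized at all.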
===== How to Test =====
After completing your implementation, you can test your solution using the following commands before submitting it to BRUTE:
----
== Test Case 1 ==
python main.py test-cases/public/instances/instance_1.json --plot
Expected output:
The trained histogram classifier achieves true error of at most 0.833 with probability at least 0.95
The trained histogram classifier achieves true error that differs from the best histogram classifier by at most 0.965 with probability at least 0.95
Comparing with reference solution
learn_classifier: Test OK
generalization_bound: Test OK
estimation_error_bound: Test OK
----
== Test Case 2 ==
python main.py test-cases/public/instances/instance_2.json --plot
Expected output:
The trained histogram classifier achieves true error of at most 0.484 with probability at least 0.9
The trained histogram classifier achieves true error that differs from the best histogram classifier by at most 0.276 with probability at least 0.9
Comparing with reference solution
learn_classifier: Test OK
generalization_bound: Test OK
estimation_error_bound: Test OK
----
== Test Case 3 ==
python main.py test-cases/public/instances/instance_3.json --plot
Expected output:
The trained histogram classifier achieves true error of at most 1.828 with probability at least 0.9
The trained histogram classifier achieves true error that differs from the best histogram classifier by at most 1.898 with probability at least 0.9
Comparing with reference solution
learn_classifier: Test OK
generalization_bound: Test OK
estimation_error_bound: Test OK
----
== Test Case 4 ==
python main.py test-cases/public/instances/instance_4.json --plot
Expected output:
The trained histogram classifier achieves true error of at most 0.65 with probability at least 0.9
The trained histogram classifier achieves true error that differs from the best histogram classifier by at most 0.397 with probability at least 0.9
Comparing with reference solution
learn_classifier: Test OK
generalization_bound: Test OK
estimation_error_bound: Test OK
This test case is prone to numerical errors. Make sure your implementation avoids them.
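To run all four public instances in one go, a small helper script along the following lines may be convenient. It is only a sketch: the names ''build_command'', ''all_tests_ok'', and ''run_public_tests'' are hypothetical and not part of the template, and it assumes the script is started from the directory that contains **main.py**.

```python
import subprocess
import sys

# The three functions whose results are reported in the expected output.
FUNCTIONS = ("learn_classifier", "generalization_bound", "estimation_error_bound")


def build_command(instance_number):
    """Build the command shown on this page for one public instance."""
    path = f"test-cases/public/instances/instance_{instance_number}.json"
    return [sys.executable, "main.py", path, "--plot"]


def all_tests_ok(output):
    """Return True if every required function reports 'Test OK' in the output."""
    return all(f"{name}: Test OK" in output for name in FUNCTIONS)


def run_public_tests(instances=(1, 2, 3, 4)):
    """Run each public instance and return a dict {instance_number: passed}."""
    results = {}
    for i in instances:
        proc = subprocess.run(build_command(i), capture_output=True, text=True)
        results[i] = proc.returncode == 0 and all_tests_ok(proc.stdout)
    return results
```

Calling ''run_public_tests()'' returns a dictionary mapping each instance number to ''True'' or ''False''. Because the commands keep the ''--plot'' flag exactly as listed above, plot windows may have to be closed manually before the next instance starts.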
===== Submission Guidelines =====
* Submit the completed code as a .zip via BRUTE.
* All Python files must be stored in the root of the submitted .zip archive.
* Make sure your implementation passes the test cases provided above. Good luck! 😊
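One possible way to build an archive that keeps all Python files in its root is sketched below. The helper names ''make_submission'' and ''files_in_root'' and the archive name ''submission.zip'' are assumptions made for illustration; BRUTE does not prescribe them.

```python
import zipfile
from pathlib import Path


def make_submission(py_files, archive_path="submission.zip"):
    """Write the given files into the archive root, dropping any directory prefix."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in py_files:
            f = Path(f)
            # arcname=f.name stores only the file name, so the file lands in the root.
            zf.write(f, arcname=f.name)
    return archive_path


def files_in_root(archive_path):
    """Return True if no archive member is located inside a subdirectory."""
    with zipfile.ZipFile(archive_path) as zf:
        return all("/" not in name for name in zf.namelist())
```

For example, ''make_submission(["main.py", "utils.py"])'' followed by ''files_in_root("submission.zip")'' checks the layout before uploading.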