====== The Minimax Task ======

The Bayesian formulation has some limitations, discussed in the {{:courses:be5b33rpz:lessons:p2rod-nonbayes.pdf|lecture}}, and in certain situations one may need to formulate the problem differently, e.g. as a Neyman-Pearson task, a Wald task or a minimax task. Out of this spectrum of problem formulations, we will examine the minimax task.

To continue with the simple OCR developed in the previous labs, let's consider a situation where we are trying to recognise licence plates. For simplicity, we again stick to the simple two-letter alphabet and already segmented 10x10 letters. The Bayes task then corresponds to designing a recognition system for a particular crossroad where the a priori probabilities of the two letters are known or can be measured.

In today's assignment we aim at designing a recognition system which could be deployed anywhere in the Czech Republic. We need to take into account that licence plates in different cities use different combinations of letters, so the a priori probabilities differ as well. Further, let's assume it is not practical to find the optimal Bayesian strategy for each camera.

As you might have already guessed, the situation is ideal for the minimax strategy. Here, the **a priori probability exists, but is not known**. We will first study the task rather theoretically to demonstrate some principles from the lecture, and then apply the gained knowledge to the real task of recognising letters on licence plates.
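
Under the 0-1 loss used in this assignment, the idea can be written compactly. In the sketch below, $\epsilon_k(q)$ is our own notation (an assumption, not taken from the lecture) for the probability that a strategy $q$ misclassifies a measurement coming from class $k$:

$$ R(q, p_K) = \sum_{k} p_K(k)\, \epsilon_k(q), \qquad q^* = \arg\min_{q} \max_{p_K} R(q, p_K) = \arg\min_{q} \max_{k} \epsilon_k(q). $$

The last equality holds because $R$ is linear in $p_K$, so its maximum over all priors is attained when the whole probability mass sits on the class with the largest error.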

<WRAP center round info>

[[https://cw.fel.cvut.cz/b191/courses/be5b33rpz/labs/python_development|General information for Python development]].

To fulfil this assignment, you need to submit these files (all packed in one ''.zip'' file) into the [[https://cw.felk.cvut.cz/brute/|upload system]]:
  * **''answers.txt''** - answers to the Assignment Questions
  * **''minimax.ipynb''** - a notebook for data initialisation, calling of the implemented functions and plotting of their results (for your convenience, will not be checked).
  * **''minimax.py''** - file with the following methods implemented:
    * **''risk_fix_q_discrete''**, **''risk_fix_q_cont''** - functions computing the risk of a fixed strategy under a changing a priori probability
    * **''worst_risk_discrete''**, **''worst_risk_cont''** - worst-case risk of a Bayesian strategy trained with a particular a priori probability but evaluated on all a priori probabilities
    * **''minmax_strategy_discrete''**, **''minmax_strategy_cont''** - functions which find the optimal minimax strategy
    * all **functions from the previous assignment (bayes)** - Unfortunately, importing from ''bayes.py'' is not supported. You have to copy your functions from ''bayes.py'' to ''minimax.py''.
  * **''plots_discrete.png''**, **''plots_cont.png''**, **''minmax_classif_cont.png''** and **''minmax_classif_discrete.png''** - images specified in the tasks

**Use the [[https://cw.fel.cvut.cz/b191/courses/be5b33rpz/labs/python_development#Assignment Templates|template]] of the assignment.** When preparing a zip file for the upload system, **do not include any directories**; the files have to be in the zip file root.
<note warning>It is not possible to import your previously implemented functions from ''bayes.py'' in BRUTE, please copy all the necessary functions directly into ''minimax.py''. Sorry for the inconvenience.</note>
</WRAP>
===== The Tasks =====

As in the last labs, use the discrete and continuous measurements $x$ computed by ''compute_measurement_lr_discrete'' and ''compute_measurement_lr_cont''.

For the experiments, choose two letters corresponding to your initials and use the distribution parameters given in the following variables (assuming your name is Chuck Norris) for the tasks with discrete measurements: <WRAP>
^  variable  ^  description  ^
| ''discrete['C']['Prob']'' | $p_{X|k}(x|C)$ given as a (21, ) numpy array (the size corresponds to the range of the values of $x$) |
| ''discrete['N']['Prob']'' | $p_{X|k}(x|N)$ given as a (21, ) numpy array (the size corresponds to the range of the values of $x$) |
</WRAP>
and the following ones for the cases with continuous measurements: <WRAP>
^  variable  ^  description  ^
| ''cont['C']['Mean']'' | mean value of the normal distribution $p_{X|k}(x|C)$ |
| ''cont['C']['Sigma']'' | standard deviation of the normal distribution $p_{X|k}(x|C)$ |
| ''cont['N']['Mean']'' | mean value of the normal distribution $p_{X|k}(x|N)$ |
| ''cont['N']['Sigma']'' | standard deviation of the normal distribution $p_{X|k}(x|N)$ |
</WRAP>
(still assuming you are Chuck Norris)
  - Use the discrete distributions and for $p_K(C) \in \{0, 0.01, \ldots, 0.99, 1\}$ compute and plot into one figure the following:
      - Risk of the Bayesian strategy optimal for each $p_K(C)$.\\ \\ **Hint 1:** Use your code from the last labs.\\ \\ **Hint 2:** In the minimax problem formulation, every error is penalised equally, independently of the class (this corresponds to the 0-1 loss function).\\ \\
      - Find the optimal strategy for $p_K(C) = 0.25$. Then, keeping this strategy fixed, vary the a priori probability and compute the risk of the strategy. To do this, complete the template function ''risk_fix_q_discrete''.\\ \\ **Hint 1:** This is what happens when the a priori probabilities we assumed differ from the real ones. \\ **Hint 2:** Equation (3) from {{:courses:be5b33rpz:labs:03_minimax:minimax_task.pdf|[1]}} explains the shape of the observed curve.\\ \\
      - For each iterated $p_K(C)$, derive the optimal Bayesian strategy and compute its worst-case risk over all possible true a priori probabilities. To do this, complete the template function ''worst_risk_discrete''.\\ \\ **Hint:** You can again use equation (3) from {{:courses:be5b33rpz:labs:03_minimax:minimax_task.pdf|[1]}} and the fact that the a priori probability is bounded to the interval $[0, 1]$.\\ \\
  - Save the figure to **''plots_discrete.png''**. \\ \\ {{:courses:be5b33rpz:labs:03_minimax:plots_discrete_md.png?600|}} \\ Assuming your name is Mirek Dušín. Chuck Norris initials refused to be used in this graph. \\ \\
  - Repeat steps 1 and 2 for the continuous distributions and save the figure to **''plots_cont.png''**. The functions to implement are the same, but with the ''_cont'' suffix instead of ''_discrete''.\\ \\ {{:courses:be5b33rpz:labs:03_minimax:plots_cont_md.png?600|}} \\ Assuming your name is Mirek Dušín. Chuck Norris initials refused to be used in this graph. \\ \\
  - Complete the template functions ''minmax_strategy_discrete'' and ''minmax_strategy_cont'', so that they find the minimax strategy and its risk for discrete and continuous measurements respectively. **Hint 1:** The minimax strategy is the Bayesian strategy with the minimal worst-case (maximal) risk over all $p_K(C)$.\\ \\ **Hint 2:** Use ''scipy.optimize.fminbound''.\\ \\ 

<code python>
D1 = discrete['C']
D2 = discrete['N']
q, risk = minmax_strategy_discrete(D1, D2)

# q -> array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
# risk -> 0.02857142873108387

D1 = cont['C']
D2 = cont['N']
q, risk = minmax_strategy_cont(D1, D2)

# q['t1'] -> -775.9750275387074
# q['t2'] -> 10080.038857544445
# q['decision'] -> array([0, 1, 0], dtype=int32)
# risk -> 0.013218704611063004
</code>
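
The risk curves from the first task can be illustrated on a toy problem. The following is only a minimal sketch under stated assumptions: the five-bin distributions are made up, and the helper names ''bayes_strategy'' and ''risk_of_strategy'' are hypothetical and do not follow the signatures of the assignment template. It shows that the risk of a fixed strategy is linear in the a priori probability, so its worst case lies at one of the endpoints of the interval.

<code python>
import numpy as np

# Toy class-conditional distributions over 5 measurement bins (made-up values).
p_x_given_c = np.array([0.5, 0.3, 0.15, 0.05, 0.0])
p_x_given_n = np.array([0.0, 0.05, 0.15, 0.3, 0.5])


def bayes_strategy(prior_c):
    """Bayes strategy for the 0-1 loss: 0 = first class, 1 = second class."""
    return (prior_c * p_x_given_c < (1 - prior_c) * p_x_given_n).astype(int)


def risk_of_strategy(q, prior_c):
    """Risk of the strategy q when the true prior of the first class is prior_c."""
    err_c = p_x_given_c[q == 1].sum()  # first class labelled as second
    err_n = p_x_given_n[q == 0].sum()  # second class labelled as first
    return prior_c * err_c + (1 - prior_c) * err_n


priors = np.linspace(0, 1, 101)

# Risk of the Bayes strategy that is re-optimised for every prior.
bayes_risks = np.array([risk_of_strategy(bayes_strategy(p), p) for p in priors])

# Risk of the strategy optimal for prior 0.25, evaluated under all priors.
q_fixed = bayes_strategy(0.25)
fixed_risks = np.array([risk_of_strategy(q_fixed, p) for p in priors])

# The fixed-strategy risk is linear in the prior, so checking the endpoints is enough.
worst_risk = max(risk_of_strategy(q_fixed, 0.0), risk_of_strategy(q_fixed, 1.0))
</code>

Plotting ''bayes_risks'' and ''fixed_risks'' against ''priors'' gives a concave curve and a straight line that never drops below it, which is the qualitative picture expected in the assignment plots.
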
  - Plot the Bayesian risk of the minimax strategy for all values of $p_K(C)$ in both the discrete and continuous cases.\\ \\
  - Generate a test set for your initials using **''create_test_set''**.
  - Compute the error of both minimax strategies on the generated test images, show each resulting classification in a figure and save the figures as **''minmax_classif_cont.png''** and **''minmax_classif_discrete.png''**. \\ \\ **Hint**: use the ''**classification_error_discrete**'' and ''**classification_error_2normal**'' functions from the bayes assignment.
 
<code python>
error_discrete = classification_error_discrete(images_test_set, labels_test_set, q_minimax_discrete)
# error_discrete -> 0.0667

error_cont = classification_error_2normal(images_test_set, labels_test_set, q_minimax_cont)
# error_cont -> 0
</code> 

{{:courses:be5b33rpz:labs:03_minimax:minimax_classification_discrete_cn.png?500|}} {{:courses:be5b33rpz:labs:03_minimax:minimax_classification_cont_cn.png?500|}}
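
The search for the minimax strategy hinted at above (minimising the worst-case risk over the prior with ''scipy.optimize.fminbound'') can be sketched on a simplified continuous problem. This is an illustration only, with loud assumptions: both Gaussians share the same standard deviation, so the Bayes strategy reduces to a single threshold (the assignment data generally lead to two thresholds), and the names ''bayes_threshold'' and ''worst_case_risk'' as well as all parameter values are hypothetical.

<code python>
import numpy as np
from scipy.optimize import fminbound
from scipy.stats import norm

# Hypothetical parameters; equal sigmas are ASSUMED so that one threshold suffices.
mean_1, mean_2, sigma = 0.0, 2.0, 1.0


def bayes_threshold(prior_1):
    """Bayes threshold t for the 0-1 loss: decide class 1 for x < t (needs mean_1 < mean_2)."""
    log_odds = np.log(prior_1 / (1 - prior_1))
    return 0.5 * (mean_1 + mean_2) + sigma**2 / (mean_2 - mean_1) * log_odds


def worst_case_risk(prior_1):
    """Worst-case risk of the Bayes strategy designed for prior_1."""
    t = bayes_threshold(prior_1)
    err_1 = norm.sf(t, loc=mean_1, scale=sigma)   # class 1 measured above t
    err_2 = norm.cdf(t, loc=mean_2, scale=sigma)  # class 2 measured below t
    # The risk is linear in the true prior, so the worst case is the larger error.
    return max(err_1, err_2)


# Search the open interval (0, 1); the endpoints would give log(0).
best_prior = fminbound(worst_case_risk, 1e-6, 1 - 1e-6)
minimax_threshold = bayes_threshold(best_prior)
minimax_risk = worst_case_risk(best_prior)
</code>

In this symmetric toy setting the search ends near a prior of 0.5 with the threshold midway between the two means. Note that ''fminbound'' is a local method; for the discrete measurements the worst-case risk is piecewise constant in the prior, so comparing its result against a simple grid search over the prior may be a useful sanity check.
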
===== Assignment Questions =====

Fill in the correct answers in your ''answers.txt'' file.
<WRAP center round help>
  - What is the maximal risk of the optimal Bayesian strategy trained for $p_K(A)=0.3$ and $p_K(B)=0.7$ using continuous measurements when the a priori probabilities are allowed to change? Round it to two decimal places.
  - The relation between the worst-case risk and the best-case risk for the minimax strategy in the continuous case is:
    * a) The worst case risk is always greater than the best case risk
    * b) The worst case risk could be smaller than the best case risk
    * c) The worst case risk is strictly equal to the best case risk
  - In the above we assumed the a priori probability exists but is not known. What would change if it didn't exist?
    * a) We can't compute the risk and have to use the likelihood ratio directly
    * b) We would need to use a different loss function W
    * c) We would proceed the same way as we did above
</WRAP>

===== Bonus Task =====

**This task is not compulsory.** 

Go through the chapter on non-Bayesian tasks in the SH10 book [2], especially the parts discussing the solution of the minimax task by linear programming (pages 25, 31, 35-38, 40-41, 46-47). Solve the above classification task using linear programming as described on page 47.

**Hints:**
  * Work with the discrete measurements $x$.
  * Represent the sought strategy (classification function) as a table $\alpha$, with $\alpha(i,k)$ being the probability of classifying bin $i$ to class $k$, such that $\alpha(i,k) \geq 0$ and $\sum_k \alpha(i,k) = 1$.
  * Reformulate the task in matrix form according to equation 2.53 in SH10, page 47.
  * Solve the task numerically using ''scipy.optimize.linprog''.
  * Compare the obtained results with the results of the classification above.
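
The hints above can be sketched as a small linear program. This is one possible formulation under our own assumptions, not necessarily the matrix form of equation 2.53 in the book: the five-bin distributions are made up, the variable layout is our choice, and an auxiliary variable ''c'' bounds the error of each class from above and is minimised.

<code python>
import numpy as np
from scipy.optimize import linprog

# Toy class-conditional distributions over 5 bins (made-up values).
p0 = np.array([0.5, 0.3, 0.15, 0.05, 0.0])
p1 = np.array([0.0, 0.05, 0.15, 0.3, 0.5])
n = p0.size

# Variable layout: [alpha(0,0), alpha(0,1), ..., alpha(n-1,0), alpha(n-1,1), c]
cost = np.zeros(2 * n + 1)
cost[-1] = 1.0  # minimise the upper bound c on both class errors

# Error of class 0: sum_i p0[i] * alpha(i,1) <= c, and symmetrically for class 1.
A_ub = np.zeros((2, 2 * n + 1))
A_ub[0, 1:2 * n:2] = p0
A_ub[1, 0:2 * n:2] = p1
A_ub[:, -1] = -1.0
b_ub = np.zeros(2)

# Each row of the table is a probability distribution: alpha(i,0) + alpha(i,1) = 1.
A_eq = np.zeros((n, 2 * n + 1))
for i in range(n):
    A_eq[i, 2 * i] = 1.0
    A_eq[i, 2 * i + 1] = 1.0
b_eq = np.ones(n)

res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
alpha = res.x[:-1].reshape(n, 2)  # randomised strategy table
minimax_risk = res.x[-1]
</code>

Because the table $\alpha$ may contain fractional entries, the linear program searches over randomised strategies as well. For this toy data its optimum (0.125) is lower than the worst-case risk of the best deterministic Bayesian strategy (0.2), which is one thing to look for when comparing the results.
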

===== References =====

  * [1] {{:courses:be5b33rpz:labs:03_minimax:minimax_task.pdf|Minimax Task}} (short support text for labs)
  * [2] Michail I. Schlesinger, Vaclav Hlavac. Ten Lectures on Statistical and Structural Pattern Recognition. Kluwer Academic Publishers, 2002.
  * [3] [[http://cmp.felk.cvut.cz/cmp/courses/recognition/slides/non-bayesian_recognition/thumb.html|Slides of Prof. Matas]]