Differences

This shows you the differences between two versions of the page.

--- courses:be5b33prg:homeworks:spam:step2 [2015/11/25 16:33]
xposik [Preparation]
+++ courses:be5b33prg:homeworks:spam:step2 [2015/12/01 16:04]
xposik [Specifications]
@@ Line 27: / Line 27: @@
 Why do we need it?
   * The function ''compute_confusion_matrix()'' is the basis for evaluating the filter's performance.
-  * The function can be used in the following way:<code python>
+The function can be used as follows. In the first example, both input dictionaries are empty, i.e. we have no information about any email, so all four counts are zero.
+<code python>
 >>> cm1 = compute_confusion_matrix({}, {})
 >>> print(cm1)
 ConfMat(tp=0, tn=0, fp=0, fn=0)
-</code>or<code python>
+</code>
+In the following code, each of the TP, TN, FP, and FN cases occurs exactly once:
+<code python>
 >>> truth_dict = {'em1': 'SPAM', 'em2': 'SPAM', 'em3': 'OK', 'em4':'OK'}
 >>> pred_dict = {'em1': 'SPAM', 'em2': 'OK', 'em3': 'OK', 'em4':'SPAM'}
@@ Line 39: / Line 45: @@
 </code>
-**Note**: You can expect that the dictionaries will have the same set of keys. Think about the situation when the keys would be different: what shall the method do?
+In the last example, the predictions match the true classes perfectly, so only TP and TN are nonzero:
+<code python>
+>>> truth_dict = {'em1': 'SPAM', 'em2': 'SPAM', 'em3': 'OK', 'em4':'OK'}
+>>> pred_dict = {'em1': 'SPAM', 'em2': 'SPAM', 'em3': 'OK', 'em4':'OK'}
+>>> cm2 = compute_confusion_matrix(truth_dict, pred_dict, pos_tag='SPAM', neg_tag='OK')
+>>> print(cm2)
+ConfMat(tp=2, tn=2, fp=0, fn=0)
+</code>
+Of course, the input dictionaries may contain any number of items, not just 4.
 >{{page>courses:be5b33prg:internal:homeworks:spam:step2#compute_confusion_matrix&editbtn}}
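
The counting behaviour shown in the examples of this revision can be sketched as follows. This is only a minimal illustration, not the course's reference solution: the ''ConfMat'' type is assumed to be a named tuple because of the way it prints, the default values of ''pos_tag'' and ''neg_tag'' are assumptions (the page does not state them), and both dictionaries are assumed to share the same set of keys.

```python
from collections import namedtuple

# Assumed result type: the field names are taken from the printed output
# in the examples (ConfMat(tp=..., tn=..., fp=..., fn=...)).
ConfMat = namedtuple('ConfMat', 'tp tn fp fn')


def compute_confusion_matrix(truth_dict, pred_dict, pos_tag='SPAM', neg_tag='OK'):
    """Count TP, TN, FP and FN over all emails listed in truth_dict.

    The default tags are an assumption made for this sketch.
    Both dictionaries are assumed to contain the same keys.
    """
    tp = tn = fp = fn = 0
    for email, truth in truth_dict.items():
        pred = pred_dict[email]  # raises KeyError if the key sets differ
        if truth == pos_tag and pred == pos_tag:
            tp += 1  # spam correctly flagged as spam
        elif truth == neg_tag and pred == neg_tag:
            tn += 1  # legitimate email correctly passed
        elif truth == neg_tag and pred == pos_tag:
            fp += 1  # legitimate email wrongly flagged as spam
        elif truth == pos_tag and pred == neg_tag:
            fn += 1  # spam wrongly passed as legitimate
    return ConfMat(tp=tp, tn=tn, fp=fp, fn=fn)


truth_dict = {'em1': 'SPAM', 'em2': 'SPAM', 'em3': 'OK', 'em4': 'OK'}
pred_dict = {'em1': 'SPAM', 'em2': 'OK', 'em3': 'OK', 'em4': 'SPAM'}
print(compute_confusion_matrix(truth_dict, pred_dict, pos_tag='SPAM', neg_tag='OK'))
```

In this sketch, a key missing from ''pred_dict'' raises a ''KeyError'', and a label that equals neither tag is silently skipped. Both are design choices of the sketch, not requirements stated on the page.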