This page is located in archive.


This shows you the differences between two versions of the page.

courses:be5b33prg:homeworks:spam:step2 [2015/11/25 16:33]
xposik [Preparation]
courses:be5b33prg:homeworks:spam:step2 [2015/12/04 14:22]
svobodat [Specifications]
Line 22: Line 22:
 from collections import namedtuple
-ConfMat = namedtuple('ConfMat', 'tptn fp fn')
+ConfMat = namedtuple('ConfMat', 'tp tn fp fn')
 </code>
 Why do we need it?
   * Function ''compute_confusion_matrix()'' represents the basis for evaluation of the filter performance.
-  * The function can be used in the following way:<code python>
+
+The function can be used in the following way. First, an example where both the input dictionaries are empty, i.e. we have no information about any email.
+<code python>
 >>> cm1 = compute_confusion_matrix({}, {})
 >>> print(cm1)
 ConfMat(tp=0, tn=0, fp=0, fn=0)
-</code>or<code python>
+</code>
+In the following code, each of TP, TN, FP, FN cases happens exactly once:
+<code python>
 >>> truth_dict = {'em1': 'SPAM', 'em2': 'SPAM', 'em3': 'OK', 'em4':'OK'}
 >>> pred_dict = {'em1': 'SPAM', 'em2': 'OK', 'em3': 'OK', 'em4':'SPAM'}
Line 39: Line 45:
 </code>
-**Note**: You can expect that the dictionaries will have the same set of keys. Think about the situation when the keys would be different: what shall the method do?
+And in the last example, the predictions perfectly match the real classes, such that only TP and TN are nonzero:
+<code python>
+>>> truth_dict = {'em1': 'SPAM', 'em2': 'SPAM', 'em3': 'OK', 'em4':'OK'}
+>>> pred_dict = {'em1': 'SPAM', 'em2': 'SPAM', 'em3': 'OK', 'em4':'OK'}
+>>> cm2 = compute_confusion_matrix(truth_dict, pred_dict, pos_tag='SPAM', neg_tag='OK')
+>>> print(cm2)
+ConfMat(tp=2, tn=2, fp=0, fn=0)
+Of course, the input dictionaries may have a different number of items than 4.
 >{{page>courses:be5b33prg:internal:homeworks:spam:step2#compute_confusion_matrix&editbtn}}
courses/be5b33prg/homeworks/spam/step2.txt · Last modified: 2015/12/04 14:22 by svobodat
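
The behaviour shown in the examples of this revision can be sketched roughly as follows. This is only one possible reading of the specification, not the reference solution. Two details are assumptions that the page does not state: the default values of ''pos_tag'' and ''neg_tag'' (here ''True'' and ''False''), and the treatment of keys that appear in only one dictionary (here they are skipped).

```python
from collections import namedtuple

ConfMat = namedtuple('ConfMat', 'tp tn fp fn')


def compute_confusion_matrix(truth_dict, pred_dict, pos_tag=True, neg_tag=False):
    """Count TP, TN, FP and FN for emails present in both dictionaries.

    The defaults for pos_tag and neg_tag are assumed, not taken from the page.
    """
    tp = tn = fp = fn = 0
    # Assumption: emails missing from either dictionary are ignored.
    for name in truth_dict.keys() & pred_dict.keys():
        truth, pred = truth_dict[name], pred_dict[name]
        if truth == pos_tag and pred == pos_tag:
            tp += 1  # positive email correctly predicted as positive
        elif truth == neg_tag and pred == neg_tag:
            tn += 1  # negative email correctly predicted as negative
        elif truth == neg_tag and pred == pos_tag:
            fp += 1  # negative email wrongly predicted as positive
        elif truth == pos_tag and pred == neg_tag:
            fn += 1  # positive email wrongly predicted as negative
    return ConfMat(tp=tp, tn=tn, fp=fp, fn=fn)
```

Iterating over the intersection of the key sets is just one answer to the question of what to do when the keys differ. Raising an exception on mismatched keys would be an equally defensible choice, since silently dropping emails can hide errors in the inputs.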