Create a function compute_confusion_matrix() that computes and returns a confusion matrix from the true classes of emails and the classes predicted by a filter.
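Task:
In module quality.py, create function compute_confusion_matrix() with the following specification.
Inputs: truth_dict and pred_dict, dictionaries mapping email names to their true and predicted classes; pos_tag, the label of the positive class (default True); neg_tag, the label of the negative class (default False).
Output: a ConfMat namedtuple holding the counts tp, tn, fp and fn.
For the spam filter, use pos_tag="SPAM" and neg_tag="OK".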
from collections import namedtuple
ConfMat = namedtuple('ConfMat', 'tp tn fp fn')
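As a small illustration of what the ConfMat type defined above offers, the sketch below builds one instance with made-up counts and reads it back. The numbers are arbitrary and serve only to show field access by name and ordinary tuple behaviour.

```python
from collections import namedtuple

ConfMat = namedtuple('ConfMat', 'tp tn fp fn')

# Arbitrary illustrative counts, not results of any real filter.
cm = ConfMat(tp=1, tn=2, fp=3, fn=4)

print(cm.fp)          # fields are accessible by name
tp, tn, fp, fn = cm   # and the object still unpacks like a plain tuple
total = sum(cm)       # total number of classified emails
```

Named fields make later code such as quality measures easier to read than positional indexing like cm[2].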
Why do we need it?
>>> cm1 = compute_confusion_matrix({}, {})
>>> print(cm1)
ConfMat(tp=0, tn=0, fp=0, fn=0)
>>> truth_dict = {'em1': 'SPAM', 'em2': 'SPAM', 'em3': 'OK', 'em4': 'OK'}
>>> pred_dict = {'em1': 'SPAM', 'em2': 'OK', 'em3': 'OK', 'em4': 'SPAM'}
>>> cm2 = compute_confusion_matrix(truth_dict, pred_dict, pos_tag='SPAM', neg_tag='OK')
>>> print(cm2)
ConfMat(tp=1, tn=1, fp=1, fn=1)
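One possible way to obtain the outputs shown in the examples above is sketched here. It is not a reference solution: the defaults pos_tag=True and neg_tag=False are an assumption, and the sketch relies on both dictionaries sharing the same keys and containing only the two tags as values.

```python
from collections import namedtuple

ConfMat = namedtuple('ConfMat', 'tp tn fp fn')


def compute_confusion_matrix(truth_dict, pred_dict, pos_tag=True, neg_tag=False):
    """Count TP, TN, FP and FN over the keys of truth_dict.

    Assumes both dictionaries have the same keys and that every value
    is either pos_tag or neg_tag; other values are silently ignored.
    """
    tp = tn = fp = fn = 0
    for key, truth in truth_dict.items():
        pred = pred_dict[key]
        if truth == pos_tag and pred == pos_tag:
            tp += 1   # positive email predicted as positive
        elif truth == neg_tag and pred == neg_tag:
            tn += 1   # negative email predicted as negative
        elif truth == neg_tag and pred == pos_tag:
            fp += 1   # negative email wrongly predicted as positive
        elif truth == pos_tag and pred == neg_tag:
            fn += 1   # positive email wrongly predicted as negative
    return ConfMat(tp=tp, tn=tn, fp=fp, fn=fn)


truth_dict = {'em1': 'SPAM', 'em2': 'SPAM', 'em3': 'OK', 'em4': 'OK'}
pred_dict = {'em1': 'SPAM', 'em2': 'OK', 'em3': 'OK', 'em4': 'SPAM'}
print(compute_confusion_matrix(truth_dict, pred_dict, pos_tag='SPAM', neg_tag='OK'))
# prints ConfMat(tp=1, tn=1, fp=1, fn=1)
```

With empty dictionaries the loop body never runs, so all four counts stay at zero.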
Note: You can expect that the dictionaries have the same set of keys. Think about the case where the keys differ: what should the function do?
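One conceivable answer to the question above is to refuse mismatched inputs outright. The sketch below shows this strict option; check_same_keys is a hypothetical helper name, not part of the assignment, and raising ValueError is only one of several reasonable policies.

```python
def check_same_keys(truth_dict, pred_dict):
    """Raise ValueError unless both dictionaries have identical key sets.

    Hypothetical helper, shown only to illustrate a strict policy.
    """
    # dict key views support set difference directly.
    missing = truth_dict.keys() - pred_dict.keys()   # emails without a prediction
    extra = pred_dict.keys() - truth_dict.keys()     # predictions without a true class
    if missing or extra:
        raise ValueError(
            f"key mismatch: {len(missing)} without a prediction, "
            f"{len(extra)} without a true class"
        )


check_same_keys({'em1': 'SPAM'}, {'em1': 'OK'})      # same keys, passes silently
try:
    check_same_keys({'em1': 'SPAM'}, {'em2': 'OK'})
except ValueError as err:
    print(err)
```

A more lenient alternative would be to count only the keys present in both dictionaries, at the cost of hiding missing predictions.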