Finish the homework on files and submit it to the upload system. Deadline is tonight 23:59!
Work on the spam filter task. Submit your solution to the upload system according to the specifications. The deadline is Dec 6, 2019!
utils.py
def read_classification_from_file(fpath)
def read_classification_from_file(fpath):
    """Return a dictionary with email classifications.

    :param fpath: string, path to a text file !truth.txt or !prediction.txt
    :return: dictionary, keys are email filenames, values are their classifications
    """
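A possible body for this function is sketched below. It assumes that each line of !truth.txt or !prediction.txt holds an email filename and its classification separated by whitespace; that file layout is not stated on this page, so check it against the specification before relying on it.

```python
def read_classification_from_file(fpath):
    """Return a dictionary mapping email filenames to their classifications."""
    classification = {}
    with open(fpath, "rt", encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            # Assumption: a valid line is exactly "<filename> <label>".
            # Blank or malformed lines are skipped.
            if len(parts) != 2:
                continue
            filename, label = parts
            classification[filename] = label
    return classification
```

Skipping malformed lines keeps the sketch tolerant of a trailing blank line; a stricter version could raise an error instead.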
def write_classification_to_file(cls_dict, fpath)
read_classification_from_file()
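The writer can be sketched as the mirror image of the reader. The example assumes one "filename label" pair per line, separated by a single space, so that read_classification_from_file() can parse the output again; the exact format is an assumption, not something this page specifies.

```python
def write_classification_to_file(cls_dict, fpath):
    """Write each filename and its classification on a separate line."""
    with open(fpath, "wt", encoding="utf-8") as f:
        for filename, label in cls_dict.items():
            # Assumption: filename and label are separated by one space.
            f.write(f"{filename} {label}\n")
```

For example, write_classification_to_file({"email01": "OK"}, "!prediction.txt") would produce a file containing the single line "email01 OK".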
quality.py
def compute_confusion_matrix(truth_dict, pred_dict, pos_tag = True, neg_tag = False)
This function has two extra parameters, pos_tag and neg_tag. They specify how positive and negative cases are encoded in the input dictionaries. For the spam filter you will typically call it with pos_tag = "SPAM" and neg_tag = "OK". The function returns a namedtuple with the fields tp, tn, fn and fp. For more information and a few test cases see Spam filter - step 2.
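One way to implement the function is sketched below. The namedtuple name ConfMat is hypothetical, and the sketch assumes that both dictionaries contain the same filenames and only the two given tags; consult Spam filter - step 2 for the authoritative test cases.

```python
from collections import namedtuple

# Hypothetical name for the result type; fields are accessed by name.
ConfMat = namedtuple("ConfMat", "tp tn fp fn")


def compute_confusion_matrix(truth_dict, pred_dict, pos_tag=True, neg_tag=False):
    """Count true/false positives and negatives over all emails in truth_dict."""
    tp = tn = fp = fn = 0
    for filename, truth in truth_dict.items():
        # Assumption: every filename in truth_dict also appears in pred_dict.
        pred = pred_dict[filename]
        if truth == pos_tag and pred == pos_tag:
            tp += 1
        elif truth == neg_tag and pred == neg_tag:
            tn += 1
        elif truth == neg_tag and pred == pos_tag:
            fp += 1
        elif truth == pos_tag and pred == neg_tag:
            fn += 1
    return ConfMat(tp=tp, tn=tn, fp=fp, fn=fn)
```

A call such as compute_confusion_matrix(truth, pred, pos_tag="SPAM", neg_tag="OK") then gives access to the counts as result.tp, result.tn, result.fp and result.fn.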
quality_score(tp, tn, fp, fn)
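This page does not give the formula for the score, so the following is only a sketch under a stated assumption: that the quality is an accuracy-like ratio in which a false positive (a legitimate email marked as spam) is penalized more heavily than a false negative. The weight FP_WEIGHT = 10 and the return value for an empty input are assumptions; take the actual definition from the task specification.

```python
# Assumption: false positives cost ten times more than false negatives.
FP_WEIGHT = 10


def quality_score(tp, tn, fp, fn):
    """Return (tp + tn) / (tp + tn + FP_WEIGHT * fp + fn)."""
    denominator = tp + tn + FP_WEIGHT * fp + fn
    if denominator == 0:
        # Assumption: with no emails at all, report the lowest score.
        return 0.0
    return (tp + tn) / denominator
```

With this weighting a filter that makes no mistakes scores 1.0, and each false positive lowers the score as much as ten false negatives would.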
compute_quality_for_corpus(corpus_dir)