Create class TrainingCorpus by deriving it from class Corpus. The class shall encapsulate a corpus in which the true classification of every email message is known, i.e. it shall represent a corpus usable for training a filter.
Tests for step 5:
Class TrainingCorpus is not obligatory and its implementation is not fixed. You can implement only those methods that you find useful. The provided tests target all the methods mentioned below; if you decide not to implement them all, delete (or comment out) the related tests in class TrainingCorpusClass.
By now, you should know everything you need to successfully implement the TrainingCorpus class. The only remaining thing to prepare:
Task:
trainingcorpus.py
Why do we need it?
!truth.txt
The specification of this class is not fixed; it is up to you to decide which methods you need. The following methods can serve as inspiration (the tests provided for this class assume that they exist):
get_class(filename)
is_ham(filename), is_spam(filename): return True or False
spams()
hams()
emails()
It is entirely up to you if you want to implement any of these methods.
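The methods listed above could be sketched roughly as follows. This is only one possible design, not the required one, and it rests on several assumptions that the text does not confirm: that !truth.txt holds one "filename label" pair per line, that the labels are the strings SPAM and OK, that Corpus stores the corpus directory in an attribute called path, and that emails() yields (filename, body) pairs. The Corpus class shown here is a hypothetical stand-in so that the example runs on its own; in your solution you would import your own Corpus instead.

```python
import os


class Corpus:
    """Hypothetical stand-in for the Corpus class from the earlier step."""

    def __init__(self, path_to_corpus):
        self.path = path_to_corpus

    def emails(self):
        # Yield (filename, body) for every regular file in the directory,
        # skipping special files whose names start with "!".
        for name in sorted(os.listdir(self.path)):
            full_path = os.path.join(self.path, name)
            if name.startswith("!") or not os.path.isfile(full_path):
                continue
            with open(full_path, "r", encoding="utf-8") as f:
                yield name, f.read()


class TrainingCorpus(Corpus):
    """A corpus whose true classification is read from the truth file."""

    TRUTH_FILE = "!truth.txt"
    SPAM_TAG = "SPAM"  # assumed label for spam
    HAM_TAG = "OK"     # assumed label for ham

    def __init__(self, path_to_corpus):
        super().__init__(path_to_corpus)
        self.truth = self._read_truth()

    def _read_truth(self):
        # Assumed file format: "<filename> <label>" on each line.
        truth = {}
        truth_path = os.path.join(self.path, self.TRUTH_FILE)
        with open(truth_path, "r", encoding="utf-8") as f:
            for line in f:
                parts = line.split()
                if len(parts) == 2:
                    truth[parts[0]] = parts[1]
        return truth

    def get_class(self, filename):
        # Raises KeyError if the file is not listed in the truth file.
        return self.truth[filename]

    def is_ham(self, filename):
        return self.truth.get(filename) == self.HAM_TAG

    def is_spam(self, filename):
        return self.truth.get(filename) == self.SPAM_TAG

    def spams(self):
        # Yield only the emails labelled as spam.
        for name, body in self.emails():
            if self.is_spam(name):
                yield name, body

    def hams(self):
        # Yield only the emails labelled as ham.
        for name, body in self.emails():
            if self.is_ham(name):
                yield name, body
```

In this sketch emails() is simply inherited from Corpus, and the truth file is read once in the constructor, so each later lookup is a dictionary access rather than a new pass over the file.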