Spam filter - step 5

Create class TrainingCorpus by deriving it from class Corpus. The class shall encapsulate a corpus in which the true classification of each email message is known, i.e. it shall represent a corpus usable for filter training.

Tests for step 5:

Class TrainingCorpus is not obligatory and its implementation is not fixed. You can implement only those methods that you find useful. The provided tests target all the methods mentioned below; if you decide not to implement them all, delete (or comment out) the related tests in class TrainingCorpusClass.

Preparation

By now, you should know everything you need to successfully implement the TrainingCorpus class. The only remaining thing to prepare:

Training data corpus

Task:

Why do we need it?

Specifications

Specifications for this class are not fixed; it is up to you to decide what methods you need. The following methods can serve as an inspiration (and the tests provided for this class assume the existence of these methods):

It is entirely up to you if you want to implement any of these methods.
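
To make the idea concrete, here is a minimal sketch of how such a class might look. Everything in it is an assumption made for illustration, not the required interface: the stand-in Corpus class with its emails() generator, the truth file name !truth.txt, the labels SPAM and OK, and the method names get_class, is_spam, is_ham, spams and hams. Adapt it to the Corpus class and file conventions you actually use, and check the provided tests for the names they expect.

```python
import os


class Corpus:
    """Stand-in for the Corpus class from the earlier steps (assumed interface)."""

    def __init__(self, path):
        self.path = path

    def emails(self):
        """Yield (file name, body) pairs; files starting with '!' are skipped."""
        for fname in sorted(os.listdir(self.path)):
            if fname.startswith("!"):
                continue  # assumed convention: '!' marks service files
            with open(os.path.join(self.path, fname), encoding="utf-8") as f:
                yield fname, f.read()


class TrainingCorpus(Corpus):
    """Corpus whose true classification is loaded from a truth file."""

    # Assumed labels and file name; replace with the ones your data uses.
    SPAM_TAG = "SPAM"
    HAM_TAG = "OK"

    def __init__(self, path, truth_fname="!truth.txt"):
        super().__init__(path)
        self.truth = {}
        with open(os.path.join(path, truth_fname), encoding="utf-8") as f:
            for line in f:
                parts = line.split()
                if len(parts) == 2:  # expected form: "<file name> <label>"
                    self.truth[parts[0]] = parts[1]

    def get_class(self, fname):
        """Return the true label of the given email file."""
        return self.truth[fname]

    def is_spam(self, fname):
        return self.get_class(fname) == self.SPAM_TAG

    def is_ham(self, fname):
        return self.get_class(fname) == self.HAM_TAG

    def spams(self):
        """Yield (file name, body) for spam emails only."""
        for fname, body in self.emails():
            if self.is_spam(fname):
                yield fname, body

    def hams(self):
        """Yield (file name, body) for ham emails only."""
        for fname, body in self.emails():
            if self.is_ham(fname):
                yield fname, body
```

With a class like this, a filter's training code could iterate over spams() and hams() separately without having to read the truth file itself.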