During this assignment you will work with data sets of emails, which also contain metadata about those emails. Such a data set is usually called a corpus. In our case, the metadata may record whether each email is spam or not, what the spam filter decided about it, or both.
You are given two data sets to work with; both come from the same source.
So, our email corpus will be:
Of course, these two files do not have to be present in the corpus directory: