Spam filter: An introduction

Before the first lecture on spam filters, think about the following questions and try to answer them:

  • What do you think is spam? How would you define it?
  • Is the definition of spam the same for everyone? Would everyone label the same emails as spam? Would one person label the same set of emails as spam today and five years from now?
  • In your view, what is a spam filter? What are its inputs and what are its outputs?
  • What are the attributes of spam? How can you decide whether an email is spam or a legitimate email?
  • Can one spam filter be used for everyone? How can a spam filter stay effective over time? How can a spam filter work for different people?
  • Can a spam filter make errors? What types of errors? When is one filter better than another? Are all errors equally serious, or are some more serious than others?
