Differences
This shows you the differences between two versions of the page.
Previous revision
2015/11/24 16:02 xposik [Data]
2015/11/24 15:58 xposik [The problem]
2015/10/13 14:24 xposik [Spam filter]
2015/10/13 14:22 xposik File moved, internal links changed to absolute form
2015/10/05 15:38 xposik [Spam filter]
2015/10/05 15:01 xposik
2015/10/05 15:00 xposik [Objectives]
2015/10/05 14:59 xposik [Objectives]
2015/10/05 14:58 xposik [What will you learn?]
2015/10/05 14:57 xposik [Spam filter]
2015/10/05 14:56 xposik [Spam filter]
2015/10/05 14:56 xposik created
courses:be5b33prg:homeworks:spam:start [2015/10/13 14:22] xposik File moved, internal links changed to absolute form
courses:be5b33prg:homeworks:spam:start [2015/11/24 16:02] xposik [Data]
Line 1:
======Spam filter======
Spam filtering is a very practical assignment with a large real-world application. It also represents a certain class of problems that we have to contend with in machine learning.
- *[[courses:be5b33prg:internal:homeworks:spam:introduction|An introduction to the problem of spam filtering]]
+ *[[courses:be5b33prg:homeworks:spam:introduction|An introduction to the problem of spam filtering]]
- *[[courses:be5b33prg:internal:homeworks:spam:evaluation|Evaluation]]
+ *[[courses:be5b33prg:homeworks:spam:evaluation|Evaluation]]
=====The problem=====
In this assignment, your main task is not to create a perfect spam filter. You do not know the methods that would allow you to do that yet. Your task is:
- * To understand the problem, analyze the assignment a decompose it.
+ * To understand the problem, analyze the assignment and decompose it.
- * To create a set of functions and objects in Python, which would help you to use a spam filter (once you create one) and evaluate its quality (compare two spam filters).
+ * To create a set of functions and classes in Python, which would help you to use a spam filter (once you create one) and evaluate its quality (compare two spam filters).
* To create a simple (even a very trivial) spam filter, which could be used in such a framework.
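The tasks listed above could be sketched roughly as follows. This is only a minimal illustration, not the course's reference solution: the label names `SPAM` and `OK`, the classes `NaiveFilter` and `KeywordFilter`, the helper `accuracy`, and the choice of plain accuracy as the quality measure are all assumptions made for this example. The measure actually used in the assignment is described on the Evaluation page.

```python
SPAM, OK = "SPAM", "OK"  # hypothetical label names, not taken from the assignment


class NaiveFilter:
    """A very trivial filter: labels every message as OK."""

    def predict(self, message):
        return OK


class KeywordFilter:
    """A simple filter: flags a message as spam if it contains a listed word."""

    def __init__(self, keywords):
        self.keywords = {word.lower() for word in keywords}

    def predict(self, message):
        words = set(message.lower().split())
        return SPAM if words & self.keywords else OK


def accuracy(spam_filter, labelled_messages):
    """Return the fraction of (message, label) pairs the filter labels correctly.

    Plain accuracy is an assumption here; it treats both kinds of error equally.
    """
    if not labelled_messages:
        raise ValueError("need at least one labelled message")
    correct = sum(
        spam_filter.predict(message) == label
        for message, label in labelled_messages
    )
    return correct / len(labelled_messages)


if __name__ == "__main__":
    # Toy data invented for this example.
    data = [
        ("win free money now", SPAM),
        ("meeting at noon", OK),
        ("free lunch tomorrow", OK),
        ("cheap pills online", SPAM),
    ]
    for candidate in (NaiveFilter(), KeywordFilter(["free", "pills"])):
        print(type(candidate).__name__, accuracy(candidate, data))
```

Because both filters expose the same `predict` method, the same evaluation function can score either of them, which is what makes comparing two spam filters straightforward.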
Line 22:
  - There are kinds of tasks for which it is hard to judge the quality of a solution.
- =====Data=====
+ ===== Data =====
- You are given two sets of data to work with. While the final evaluation of your work will be done using different set of data, your spam filter should work on both.
+ We provide you with [[courses:be5b33prg:homeworks:spam:data|2 sets of data]] to work with. While the final evaluation of your work will be done using different set of data, your spam filter should work on both. It is also important that you understand the format of the data that we will use; it is described on the page linked above.
- <WRAP round download>
- {{filelist>:courses:a4b99rph:cviceni:files:spam-data-12-s75-h25.zip&style=table&tableheader=1&tableshowdate=1&tableshowsize=1}}
- </WRAP>
courses/be5b33prg/homeworks/spam/start.txt · Last modified: 2015/11/24 16:02 by xposik