xp36vpd -- Selected parts of data mining

Data mining aims to reveal non-trivial, hidden and ultimately applicable knowledge in large data. This course focuses on two key data mining issues: data size and data heterogeneity. When dealing with large data, it is important to resolve both technical issues, such as distributed computing or hashing, and general algorithmic complexity. The first part of the course will be motivated mainly by case studies on web and social network mining. The second part will discuss approaches that merge heterogeneous prior knowledge with measured data; bioinformatics will be the main application field here. It is assumed that students have completed the master's course on Machine Learning and Data Analysis (A4M33SAD).

The course takes the form of a reading and discussion group. Each student gives two 1-hour lectures, followed by a 30-minute discussion. One of the lectures shall cover a general DM topic (MMDS book chapters, recent tutorials at major ML/DM conferences, etc.); the second can present your research (if DM related) or a DM topic that is closely related to your research or research interests.

Go beyond the literature: provide your own insight, offer your own illustrative examples, etc.

Fall 2017

Meetings every Friday at 11:00 in 205. NOT as in the official schedule!

L	Date	Presents	Contents	Materials
1	Oct 6	JK, FZ	Course overview, introduction, research interests.
2	Oct 13	Karel Horák	Learning to rank	learningtorank.pdf
	Oct 13	Martin Svatoš	Relational frequent pattern mining	vpd_frpm.pdf
3	Oct 20	Petr Ryšavý	Clustering of biological sequences	bioclustering.pdf
	Oct 20	Jan Mrkos	Multicriterial learning (clustering)	multicriterial.pdf
4	Oct 27	Petr Váňa	Reinforcement learning in robotics	rl-vana.pdf
	Oct 27	Martin Matyášek	(Deep) reinforcement learning	deep-rl.pdf
5	Nov 3	Martin Matyášek	AlphaGo, a deep RL application in games	deep-rl-all.pdf
	Nov 3	Petr Čížek	Managing and mining (streaming) sensor data	stream_mining_cizek.pdf
6	Nov 10	Jáchym Barvínek	Learning structure-activity models	qsar.pdf
	Nov 10	Jan Šimbera	Nonlinear dimensionality reduction	dimred.pdf
7	Nov 17		Holiday
8	Nov 24	Filip Paulů	Data mining applications in manufacturing	cancelled
	Nov 24	Marek Cuchý	Mining social networks
9	Dec 1	Karel Horák	Deep learning: methods and applications	deeplearning.pdf
	Dec 1	Martin Svatoš	Rule extraction from neural networks	vpd_ruleextraction.pdf
10	Dec 8	Petr Ryšavý	Recommender systems	recommendersystems.pdf
	Dec 8	Petr Čížek	Learning from various sensory modalities	sensory.pdf
11	Dec 15	Jan Mrkos	Non-convex optimization in machine learning	non-convex.pdf
12	Jan 5	Jáchym Barvínek	Abductive logic programming for metabolic network learning	abductive.pdf
	Jan 5	Jan Šimbera	Machine learning in geography	geo.pdf
13	Jan 12	Marek Cuchý	Advertising on the Web
	Jan 12	Petr Váňa	Reinforcement learning in robotics, part II
	Jan 12	JK, FZ	Exam

References

Rajaraman, A., Leskovec, J., Ullman, J. D.: Mining of Massive Datasets, Cambridge University Press, 2011.
Free data mining books
Recent tutorials, major ML/DM conferences: KDD2014, DS2014, ICML16, ECML16
Review papers: Yang, Wu: 10 Challenging Problems in Data Mining Research
External seminars: ML seminars at MFF, PIS, Machine Learning Meetups.

Links

Evaluation, requirements

every student must give their talks (the principal requirement in this type of course),
attend other students' presentations and take an active part in the discussion,
and pass the exam, i.e., demonstrate knowledge of the basic concepts presented during the course.
