====== xp36vpd -- Selected parts of data mining ======

**Data mining** aims at revealing non-trivial, hidden and ultimately applicable knowledge in large data. This course focuses on two key data mining issues: data size and data heterogeneity. When dealing with large data, it is important to resolve both technical issues, such as distributed computing or hashing, and general algorithmic complexity. In this part, the course will be motivated mainly by case studies on web and social network mining. The second part will discuss approaches that merge heterogeneous prior knowledge with measured data. Bioinformatics will be the main application field here. It is assumed that students have completed the master course on Machine Learning and Data Analysis (A4M33SAD).

The course takes the form of a **reading and discussion group**. Each student gives two 1-hour lectures, followed by a 30-minute discussion. One of the lectures should cover a general DM topic (MMDS book chapters, recent tutorials at major ML/DM conferences, etc.); the second can present your research (if DM-related) or a DM topic that is closely related to your research or research interests. Go beyond the literature: provide your own insight, offer your own illustrative examples, etc.

===== Fall 2017 =====

Meetings take place every Friday at 11:00 in room 205, NOT as in the official schedule!

^ L ^ Date ^ Presenter ^ Contents ^ Materials ^
| 1 | Oct 6 | JK, FZ | Course overview, introduction, research interests. | |
| 2 | Oct 13 | Karel Horák | Learning to rank | {{:courses:xp36vpd:learningtorank.pdf|}} |
| | Oct 13 | Martin Svatoš | Relational frequent pattern mining | {{:courses:xp36vpd:vpd_frpm.pdf|}} |
| 3 | Oct 20 | Petr Ryšavý | Clustering of biological sequences | {{:courses:xp36vpd:bioclustering.pdf|}} |
| | Oct 20 | Jan Mrkos | Multicriterial learning (clustering) | {{:courses:xp36vpd:multicriterial.pdf|}} |
| 4 | Oct 27 | Petr Váňa | Reinforcement learning in robotics | {{:courses:xp36vpd:rl-vana.pdf|}} |
| | Oct 27 | Martin Matyášek | (Deep) reinforcement learning | {{:courses:xp36vpd:deep-rl.pdf|}} |
| 5 | Nov 3 | Martin Matyášek | AlphaGo, a deep RL application in games | {{:courses:xp36vpd:deep-rl-all.pdf|}} |
| | Nov 3 | Petr Čížek | Managing and mining (streaming) sensor data | {{:courses:xp36vpd:stream_mining_cizek.pdf|}} |
| 6 | Nov 10 | Jáchym Barvínek | Learning structure-activity models | {{:courses:xp36vpd:qsar.pdf|}} |
| | Nov 10 | Jan Šimbera | Nonlinear dimensionality reduction | {{:courses:xp36vpd:dimred.pdf|}} |
| 7 | Nov 17 | | Holiday | |
| 8 | Nov 24 | Filip Paulů | Data mining applications in manufacturing | cancelled |
| | Nov 24 | Marek Cuchý | Mining social networks | |
| 9 | Dec 1 | Karel Horák | Deep learning: methods and applications | {{:courses:xp36vpd:deeplearning.pdf|}} |
| | Dec 1 | Martin Svatoš | Rule extraction from neural networks | {{:courses:xp36vpd:vpd_ruleextraction.pdf|}} |
| 10 | Dec 8 | Petr Ryšavý | Recommender systems | {{:courses:xp36vpd:20161202_vpd_recommendersystems.pdf|recommendersystems.pdf}} |
| | Dec 8 | Petr Čížek | Learning from various sensory modalities | {{:courses:xp36vpd:sensory.pdf|}} |
| 11 | Dec 15 | Jan Mrkos | Non-convex optimization in machine learning | {{:courses:xp36vpd:janmrkos-non-convex_optimization_in_ml.pdf|non-convex.pdf}} |
| 12 | Jan 5 | Jáchym Barvínek | Abductive logic programming for metabolic network learning | {{:courses:xp36vpd:abductive.pdf|}} |
| | Jan 5 | Jan Šimbera | Machine learning in geography | {{:courses:xp36vpd:geo.pdf|}} |
| 13 | Jan 12 | Marek Cuchý | Advertising on the Web | |
| | Jan 12 | Petr Váňa | Reinforcement learning in robotics, part II | |
| | Jan 12 | JK, FZ | **Exam** | |

===== References =====

  * Rajaraman, A., Leskovec, J., Ullman, J. D.: [[http://www.mmds.org/|Mining of Massive Datasets]], Cambridge University Press, 2011.
  * [[http://bigdata-madesimple.com/27-free-data-mining-books/|Free Data Mining Books]]
  * Recent tutorials, major ML/DM conferences: [[http://www.kdd.org/kdd2014/tutorials.html|KDD2014]], [[http://ds2014.ijs.si/index.php?page=invited|DS2014]], [[http://icml.cc/2016/?page_id=97|ICML16]], [[http://www.ecmlpkdd2016.org/program.html|ECML16]]
  * Review papers: [[http://www.cs.uvm.edu/~icdm/10Problems/10Problems-06.pdf|Yang, Wu: 10 Challenging Problems in Data Mining Research]]
  * External seminars: [[http://ai.ms.mff.cuni.cz/~sui/|ML seminars at MFF]], [[http://praguecomputerscience.cz/|PIS]], [[http://www.mlmu.cz/program/|Machine Learning Meetups]].

===== Links =====

  * Lecturers: [[http://ida.felk.cvut.cz/klema/|Jiří Kléma]], [[http://ida.felk.cvut.cz/zelezny/|Filip Železný]]
  * [[https://www.fel.cvut.cz/cz/education/rozvrhy-ng.B171/public/cz/predmety/29/68/p2968906.html|Class schedule]].
  * [[https://www.fel.cvut.cz/cz/education/bk/predmety/29/68/p2968906|Course syllabus]].

===== Evaluation, requirements =====

  * give both of your talks (the principal requirement in this type of course),
  * attend the presentations of other students and take an active part in the discussions,
  * pass the exam, i.e., demonstrate knowledge of the basic concepts presented during the course.