CS423: Data Mining (1/2016)

Lecturer: Jakramate Bootkrajang
Office hours: 17.00-18.00 Tuesday, Friday @ CSB 107

Class mark (as of 06/09/16) Google spreadsheet
Welcome to the course's homepage
A new version of the Basic concepts slides is coming
Updated: Basic concepts' slides
Syllabus [PDF]
I. Introduction + Basic concepts [Slides-Intro] [Slides-Concept]
II. Data preprocessing [Slides] Supplementary reading
Data cleaning
Data integration
Data transformation
III. Dimensionality Reduction Techniques
PCA [Slides] [Eigenvectors]
Feature subset selection [Slides]
Random projection [Slides]
Dimensionality reduction lab [Slides] [Dataset]
IV. Mining Association Rules and Recommendation Systems
Concept [Slides]
Apriori Algorithm [Original paper]
Recommendation system [Slides]
V. Classification
Bayesian Learning [Slides]
Linear Discriminant Analysis [Slides]
Logistic Regression [Slides]
K-nearest neighbours [Slides]
Classifier evaluation [Slides]
VI. Clustering
I will use the well-prepared materials by David M. Blei, which can be obtained from the links provided.
Partitioning clustering -- k-means, k-medoids [Slides1]
Hierarchical clustering -- agglomerative, divisive [Slides2]

Assignment 1. (due 16 Sep.) [PDF] All datasets
Assignment 2. (due 11 Oct.) [PDF] Code template download
Assignment 3. (due 15 Nov.) [PDF]
Assignment 4. (due 29 Nov.) [PDF]

Useful resources
Data Mining, Concepts and Techniques: Jiawei Han and Micheline Kamber, MORGAN KAUFMANN.
Data Mining, Concepts, Models, Methods, and Algorithms: Mehmed Kantardzic, WILEY-INTERSCIENCE
Pattern Classification: Richard O. Duda, Peter E. Hart, and David G. Stork, WILEY
Mining of Massive Datasets: Anand Rajaraman, Jure Leskovec, Jeffrey D. Ullman [PDF]
Lecture note (in Thai)
Data mining [PDF]
I. Guyon and A. Elisseeff, An Introduction to Variable and Feature Selection, JMLR (2003)
C. C. Aggarwal, A. Hinneburg, D. A. Keim, On the Surprising Behavior of Distance Metrics in High Dimensional Space, ICDT (2001)
Online Material
Introduction to Scilab
Summary of distance measures