## CS423: Data Mining (1/2016)

Lecturer: Jakramate Bootkrajang

Office hours: 17.00-18.00, Tuesday and Friday @ CSB 107

Email: jakramate.b@cmu.ac.th

- Class marks (as of 06/09/16): Google spreadsheet
- Welcome to the course's homepage
- A new version of the Basic concepts slides is coming
- Updated: Basic concepts slides

- I. Introduction + Basic concepts [Slides-Intro] [Slides-Concept]
- II. Data preprocessing [Slides] Supplementary reading
  - Data cleaning
  - Data integration
  - Data transformation
- III. Dimensionality Reduction Techniques
  - PCA [Slides] [Eigenvectors]
  - Feature subset selection [Slides]
  - Random projection [Slides]
  - Dimensionality reduction lab [Slides] [Dataset]
- IV. Mining Association Rules and Recommendation Systems
  - Concept [Slides]
  - Apriori Algorithm [Original paper]
  - Recommendation system [Slides]
- V. Classification
  - Bayesian Learning [Slides]
  - Linear Discriminant Analysis [Slides]
  - Logistic Regression [Slides]
  - K-nearest neighbours [Slides]
  - Classifier evaluation [Slides]
- VI. Clustering
  - This part uses materials prepared by David M. Blei, available from the links provided.
  - Partitioning clustering: k-means, k-medoids [Slides1]
  - Hierarchical clustering: agglomerative, divisive [Slides2]

- Books
  - Data Mining: Concepts and Techniques. Jiawei Han and Micheline Kamber, Morgan Kaufmann.
  - Data Mining: Concepts, Models, Methods, and Algorithms. Mehmed Kantardzic, Wiley-Interscience.
  - Pattern Classification. Richard O. Duda, Peter E. Hart, and David G. Stork, Wiley.
  - Mining of Massive Datasets. Anand Rajaraman, Jure Leskovec, Jeffrey D. Ullman [PDF]
- Lecture notes (in Thai)
  - Data mining [PDF]
- Papers
  - I. Guyon and A. Elisseeff, An Introduction to Variable and Feature Selection, JMLR (2003)
  - C. C. Aggarwal, A. Hinneburg, D. A. Keim, On the Surprising Behavior of Distance Metrics in High Dimensional Space, ICDT (2001)
- Online Material
  - Introduction to Scilab
  - Summary of distance measures