K6225 Knowledge Discovery & Data Mining

What the instructor, Dr Chris Khoo, says about the course:

There are two data mining courses offered in the School:

  • a technical course offered in the MSc Information Systems programme (CI6227 Data Mining)
  • a practical course (this course, K6225) using a how-to-do-it, how-does-it-work and how-to-apply-it kind of approach, with a minimum of mathematics.

This course seeks to develop the student’s commonsense ability to manipulate data from different angles. When I first taught this course more than 10 years ago, I focused on teaching methods and techniques, expecting students to be able to use common sense to apply them. I was horrified at the end of the semester to find in the term reports and exam answers that students had many misconceptions and were applying the methods incorrectly. I gradually learnt that “common sense” is actually uncommon, and that data analysis is an art, requiring knowledge, skill and creativity. The course now adopts a more problem-based approach in which students analyse a particular dataset throughout the semester, each week applying the technique they have learnt in class. Every week, 2 or 3 groups of students give a 3-minute presentation of their data analysis results, so that I can point out misconceptions, ways the analysis can be improved, and subtleties not highlighted in the lecture material.

This semester, I’m experimenting with social media to supplement classroom interaction. A blog and a Twitter account will be set up for students to send comments, questions and reflections. This is in addition to the discussion forum in EdveNTUre.

The first half of the semester is devoted to statistical analysis, and the second half to machine-learning methods. This is because there is no separate statistics course in the KM programme, and I think it is dangerous to go into an organisation to do data mining without knowing basic statistical analysis.

PhD students have found this a good substitute for a stats course. I must caution students, though, that the course doesn’t cover experimental design and Analysis of Variance (ANOVA), which PhD students doing quantitative research should know. (Courses on ANOVA are available in the Psychology Division.)
