Data Mining (B-KUL-H02C6A)

4 ECTS | English | 37 | Second term | Cannot be taken as part of an examination contract
POC Artificial Intelligence

Today it is possible to collect and store vast quantities of data. These data often contain valuable information and insights. However, it may take human analysts weeks or months to discover the information, if they are able to do it at all. Furthermore, there is so much data that most of it is never even analyzed. The goal of data mining is to fill this void by automatically identifying models and patterns from these databases that are (1) valid, that is, they hold on new data with some certainty, (2) novel, that is, they are non-obvious, (3) useful, that is, they are actionable, and (4) understandable, that is, humans can interpret them. In order to do this, data mining, also called knowledge discovery in databases (KDD), combines ideas from the fields of machine learning, databases, statistics, visualization, and many other fields.

The goal of this course is to provide a broad survey of several important and well-known fields of data mining and to develop an overall sense of how to extract information from data in a systematic way. It tries to give insight into the challenges faced by data miners and the inner workings of specific data mining algorithms, as well as to provide some understanding of why data mining is important and interesting. The course consists of lectures, readings and exercise sessions. The exercise sessions reinforce the central concepts covered during class and give students some experience working with publicly available data mining tools. The course requires knowledge of machine learning.

Bachelor or Master level with at least basic knowledge of computers, algorithms and data structures. Moreover, the students should be comfortable with mathematical concepts such as differentiation, probability and statistics.


Knowledge of Machine Learning techniques. Specifically, the student must have followed either (1) the "Machine Learning and Inductive Inference" (B-KUL-H02C1A) class, (2) the Beginselen van machine learning (B-KUL-H0E96A) / Principles of Machine Learning (B-KUL-H0E98A) class, or a course that was deemed to be equivalent.


This course unit is a prerequisite for taking the following course units:
H00Y4A : Big Data Analytics Programming

Activities

3.2 ects. Data Mining: Lecture (B-KUL-H02C6a)

3.2 ECTS | English | Format: Lecture | 17 | Second term
POC Artificial Intelligence

Topics covered include (not necessarily in this order):

1) Data mining overview

2) The data mining process

3) Recommender systems

4) Association rule mining

5) Sequential pattern mining

6) Clustering

7) Large scale decision tree learning

8) Advanced topics on ensemble methods

9) Using unlabeled data

10) Data streams

11) Advanced topics (time permitting)
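
As a small illustration of topic 4 (association rule mining), the following is a minimal sketch of the two standard rule measures, support and confidence, together with a brute-force search for frequent item pairs. The transactions, the function names and the 0.5 threshold are hypothetical and chosen only for illustration; they are not taken from the course material, and practical algorithms prune the search space rather than enumerate every pair.

```python
from itertools import combinations

# Hypothetical toy transactions (not course data); each one is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)


def frequent_pairs(transactions, min_support):
    """Brute-force enumeration of all item pairs whose support is at least min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        s = support(pair, transactions)
        if s >= min_support:
            result[pair] = s
    return result


if __name__ == "__main__":
    print(support({"bread", "milk"}, transactions))
    print(confidence({"bread"}, {"milk"}, transactions))
    print(frequent_pairs(transactions, min_support=0.5))
```

On this toy data the rule bread -> milk has support 0.5 (two of four transactions) and confidence 2/3 (two of the three transactions containing bread also contain milk).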

0.8 ects. Data Mining: Practical Sessions (B-KUL-H00I0a)

0.8 ECTS | English | Format: Practical | 20 | Second term
POC Artificial Intelligence

The exercise sessions reinforce the central concepts covered during class and give students some experience working with publicly available data mining tools. More specifically, tasks may include:

1) Working through the control flow of an algorithm to better understand how it functions

2) Implementing a small part of an algorithm

3) Working through a small part of the data mining process

4) Using Weka to analyse data

5) Theoretical questions designed to extend a student's knowledge of the subject

6) Discussing and solving a data mining problem with a small group and presenting the conclusions of the discussion to the whole exercise session

Evaluation

Evaluation: Data Mining (B-KUL-H22C6a)

Type : Exam during the examination period
Description of evaluation : Written
Type of questions : Closed questions, Open questions
Learning material : None


Closed-book written exam about the topics covered in the lectures, exercise sessions and readings. The goal will be to assess two questions:

1) Do you understand the important basic concepts covered in class?

2) Do you have an advanced understanding of the topics covered?

Some questions will be similar in spirit to those solved in the exercise sessions while others will ask a student to apply a learned concept in a different context.  Be sure to read all the questions carefully and to think about how the answer to each question is structured.