Modern Data Analytics (B-KUL-G0Z39B)

4 ECTS | English | 20 | Second term | Cannot be taken as part of an examination contract
POC Master in statistiek

After having completed this course, the student will have acquired the necessary practical skillset and theoretical knowledge to deal with a wide variety of data science-related tasks. The course will equip the student with a solid set of tools to successfully approach his/her Master’s thesis and/or any other assignment in the programme Master of Statistics or beyond.

Each lecture is set up as an interactive workshop that requires active participation; students are expected to read the pre-course notes beforehand.

Python will be the main programming language used throughout this course.

Skills: the student should be able to analyse, synthesise and interpret.

Knowledge

  • Experience with at least one programming language
  • Fundamental concepts of statistics

This course is identical to the following courses:
G0Z39C : Modern Data Analytics

Activities

4 ects. Modern Data Analytics (B-KUL-G0Z39a)

4 ECTS | English | Format: Lecture | 20 | Second term
POC Master in statistiek

Introduction to Python: Jupyter Notebooks, Broadcasting, Indexing, Python Package Manager, Functions, Control structures, Graphs
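As a rough illustration of two of the introductory topics, the sketch below shows broadcasting and indexing. It assumes the NumPy package, which the course description does not name explicitly, and uses made-up values.

```python
import numpy as np

# A small 3x4 matrix of made-up values (rows = observations, columns = features).
X = np.arange(12, dtype=float).reshape(3, 4)

# Broadcasting: the column means have shape (4,) and are stretched
# across all 3 rows, so no explicit loop is needed.
col_means = X.mean(axis=0)
X_centered = X - col_means

# Boolean indexing: keep only the rows whose first feature exceeds 2.
mask = X[:, 0] > 2
selected = X[mask]

# Slicing: every row, last two columns.
last_two = X[:, -2:]

print(X_centered.mean(axis=0))
print(selected.shape)
print(last_two.shape)
```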

1. Python Advanced DNA

  • Object oriented programming in Python
  • Managing Python environments on a single computer
  • Working as a team and collaborating on GitHub
  • Plotly graphs
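To give a flavour of the object-oriented programming topic, here is a minimal sketch of a class with a subclass. The class names `Dataset` and `ScaledDataset` and the values are hypothetical and are not taken from the course material.

```python
class Dataset:
    """A hypothetical container for a named list of numeric observations."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)

    def mean(self):
        # Arithmetic mean of the stored observations.
        return sum(self.values) / len(self.values)

    def __len__(self):
        return len(self.values)

    def __repr__(self):
        return f"Dataset(name={self.name!r}, n={len(self)})"


class ScaledDataset(Dataset):
    """Subclass that rescales every value by a constant factor (inheritance)."""

    def __init__(self, name, values, factor):
        super().__init__(name, [v * factor for v in values])
        self.factor = factor


heights_m = Dataset("heights_m", [1.60, 1.75, 1.80])
heights_cm = ScaledDataset("heights_cm", heights_m.values, factor=100)
print(heights_cm, round(heights_cm.mean(), 1))
```

The subclass reuses `mean`, `__len__` and `__repr__` from the parent class and only overrides how the values are stored.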

2. Basics of Machine Learning

We briefly discuss some basic concepts of Machine Learning (ML). Through the use of examples, we go through several common applications and pitfalls of ML. This should be used as a starting point for students with a limited background in ML.

  • Supervised Learning: We start by discussing the idea behind regression analysis and how a regression problem is tackled in practice. Along the way, we emphasize the implications and dangers of each step in the modeling process. We delve deeper into the bias-variance trade-off and discuss some popular methods used to control this trade-off. We finish by discussing classification problems.

  • Unsupervised Learning: We provide some important examples commonly used for different unsupervised learning tasks. We look at clustering problems, outlier detection, and dimension reduction. 
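The bias-variance trade-off mentioned under Supervised Learning can be illustrated with a small experiment. The sketch below is not the course's own example: it assumes scikit-learn and NumPy, generates synthetic data from a noisy sine curve, and compares polynomial regressions of different degrees by 5-fold cross-validation. A very low degree tends to underfit (high bias), while a very high degree tends to overfit (high variance).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for illustration): noisy sine curve on [0, 1].
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=60)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=60)
X = x.reshape(-1, 1)

# Estimate the out-of-sample error for models of increasing flexibility.
cv_mse = {}
for degree in (1, 5, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    cv_mse[degree] = -scores.mean()  # scores are negated MSE values

for degree, mse in cv_mse.items():
    print(f"degree {degree:2d}: cross-validated MSE = {mse:.3f}")
```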

3. Building Machine Learning Pipelines

In this lecture, we learn how to use the models in the Scikit-Learn package. We go through several examples to familiarise ourselves with the API. We showcase some examples of preprocessing such as PCA, standardization and one-hot encoding. Finally, we introduce the notion of transformers and pipelines which are tools that can greatly improve code readability and structure.
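A minimal sketch of how these pieces might fit together in scikit-learn is shown below. The toy data frame, its column names and the choice of logistic regression as the final model are assumptions made for illustration; pandas is also assumed. Numeric columns are standardized and reduced with PCA, the categorical column is one-hot encoded, and everything is chained in a single pipeline.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up data: three numeric features, one categorical feature, binary target.
df = pd.DataFrame({
    "age": [23, 35, 47, 52, 29, 41, 60, 33],
    "income": [28, 45, 61, 70, 33, 52, 80, 40],
    "hours": [40, 38, 45, 50, 36, 42, 48, 39],
    "city": ["Leuven", "Ghent", "Leuven", "Brussels",
             "Ghent", "Brussels", "Leuven", "Ghent"],
})
y = [0, 0, 1, 1, 0, 1, 1, 0]

# Numeric branch: standardize, then keep two principal components.
numeric = Pipeline([("scale", StandardScaler()), ("pca", PCA(n_components=2))])

# Route each group of columns to its own transformer.
preprocess = ColumnTransformer([
    ("num", numeric, ["age", "income", "hours"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

# Full pipeline: preprocessing followed by a classifier.
model = Pipeline([("preprocess", preprocess), ("clf", LogisticRegression())])
model.fit(df, y)
predictions = model.predict(df)
print(predictions)
```

Because preprocessing and model live in one object, the same transformations are applied consistently at fit and predict time, which is the readability and structure benefit described above.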

4. Building & Deploying Apps

5. Analytics Infrastructure

In this lecture, we kick off the third part of the course, which focuses on the building blocks of a data-driven IT architecture. We explore some key elements of data analytics infrastructure and introduce two commonly used tools for managing machine learning projects.

6. Version Control and Code Repositories

This lecture covers the basics of Git and GitHub. The goal is to familiarise the student with the core concepts of version control and repositories, and to go over some basic examples of using Git and GitHub in practice.

7. Cloud Computing

In this lecture, we explain how to use the cloud (AWS) for data science assignments. The following topics are covered:

  • storage
  • user management
  • Python interaction
  • computation engines

 

Slides will be provided on Toledo.

Evaluation

Evaluation: Modern Data Analytics (B-KUL-G2Z39b)

Type : Partial or continuous assessment with (final) exam during the examination period
Description of evaluation : Oral, Paper/Project, Report
Type of questions : Open questions
Learning material : Computer



Continuous evaluation: students deliver two projects in teams.

Oral exam in which the projects are discussed. Use of a laptop is permitted.

Oral exam; use of a laptop is permitted.