Reinforcement Learning

Reinforcement Learning (B-KUL-H0O23A)

4 ECTS

English

Second term

Cannot be taken as part of an examination contract

Marra Giuseppe

POC ir. Artificiële Intelligentie

Aims

This course familiarises students with the domains of planning and reinforcement learning, which are concerned with sequential decision making and learning in intelligent agents.

After following this course, students will

- have a deep understanding of Markov Decision Processes and their role in planning and reinforcement learning,

- understand different settings studied in AI, especially in sequential decision making and reinforcement learning, including full vs partial observability, online vs offline, model-based vs model-free, single vs multi-agent, and Markovian vs non-Markovian,

- have an overview of the existing techniques and algorithms for planning and reinforcement learning under different conditions,

- understand how these techniques work, why they work, and when they work,

- be able to incorporate these techniques into intelligent agents, AI systems, and their applications,

- be up to date with the current state of the art and be able to familiarise themselves with new research results in the area.

Previous knowledge

Knowledge of machine learning and neural networks.

Order of Enrolment

Mixed prerequisite:
You may only take this course if you comply with the prerequisites. Prerequisites can be strict or flexible, or can imply simultaneity. A degree level can also be a prerequisite.
Explanation:
STRICT: You may only take this course if you have passed or applied tolerance for the courses for which this condition is set.
FLEXIBLE: You may only take this course if you have previously taken the courses for which this condition is set.
SIMULTANEOUS: You may only take this course if you also take the courses for which this condition is set (or have taken them previously).
DEGREE: You may only take this course if you have obtained this degree level.

SIMULTANEOUS(H02C1A) OR SIMULTANEOUS(H0E96A)

The codes of the course units mentioned above correspond to the following course descriptions:
H02C1A : Machine Learning and Inductive Inference
H0E96A : Beginselen van machine learning

Is included in these courses of study

Master of Artificial Intelligence (Leuven) (Specialisation: Engineering and Computer Science (ECS)) 60 ects.
Master in de ingenieurswetenschappen: artificiële intelligentie (Leuven) 120 ects.

Activities

3 ects. Reinforcement Learning: Lecture (B-KUL-H0O23a)

3 ECTS

English

Format: Lecture

Second term

Marra Giuseppe

POC ir. Artificiële Intelligentie

Content

Introduction to planning and reinforcement learning

Multi-armed bandits and their algorithms

-- exploration vs exploitation

-- rewards and regret

-- greedy algorithms

-- upper confidence bounds

Markov Decision Processes and their variants

-- Bellman Equations

-- Policies and value functions

-- Optimality

-- Partial and full observability

Dynamic Programming

-- Policy evaluation, improvement and iteration

-- Value iteration

Monte Carlo Methods

Temporal-difference learning

-- TD prediction

-- Q-learning

-- Sarsa

-- On-policy vs off-policy

-- n-step bootstrapping

Planning and learning with tabular methods

-- Dyna: integrated planning, acting and learning

-- Real-time dynamic programming

-- Monte Carlo tree search

Approximate methods

-- Value function approximation

-- Gradient methods

-- on-policy and off-policy variants

Policy gradient methods

-- Policy approximation

-- Policy gradients

-- Actor Critic

Contemporary topics

-- Deep reinforcement learning

-- Multi-agent reinforcement learning

-- Shielding and safe reinforcement learning

-- Relational reinforcement learning and traditional planning

Applications in game playing and beyond
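
As an illustration of the kind of algorithm covered under temporal-difference learning, the following is a minimal sketch of tabular Q-learning with epsilon-greedy exploration. It is not taken from the course material: the corridor environment, the function name `q_learning`, and all parameter values are assumptions chosen only to keep the example small.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1, actions are 0 (left) and 1 (right).
    Reaching the rightmost state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy behaviour policy: explore with probability epsilon.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                # Break ties at random so the untrained agent still moves around.
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            # Assumed deterministic dynamics: moving left from state 0 stays in place.
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Off-policy TD target: bootstrap from the greedy value of the next state.
            # The goal state is never updated, so its value stays 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state

    return q


if __name__ == "__main__":
    for s, (left, right) in enumerate(q_learning()):
        print(f"state {s}: Q(left)={left:.3f}  Q(right)={right:.3f}")
```

The update is off-policy because the target uses the maximum over next-state action values rather than the value of the action the behaviour policy actually takes next; replacing that maximum with the value of the next action chosen would give Sarsa, the on-policy counterpart.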

Course material

Sutton and Barto, Reinforcement learning: an Introduction, 2nd Edition.

Additional materials on Toledo.

1 ects. Reinforcement Learning: Exercises (B-KUL-H0O24a)

1 ECTS

English

Format: Practical

Second term

Marra Giuseppe

POC ir. Artificiële Intelligentie

Content

6 sessions of 2.5 hours and some assignments

The exercise sessions practice the concepts, models and techniques seen in the lectures.

Course material

The exercise material will be made available on Toledo.

Evaluation

Evaluation: Reinforcement Learning (B-KUL-H2O23a)

Type : Partial or continuous assessment with (final) exam during the examination period

Description of evaluation : Written, Paper/Project

Type of questions : Multiple choice, Open questions, Closed questions

Learning material : None

Explanation

The evaluation consists of a written exam in the exam period and permanent evaluation during the semester:

- The closed-book exam consists of a theoretical part and an exercise part.

- The permanent evaluation involves applying the material seen in the lectures and exercises in a new context (practical).

Information about retaking exams

The result of the permanent evaluation is carried over to the third examination period, but not to a following academic year.
