Reinforcement Learning (B-KUL-H0O23A)
Aims
This course familiarises students with the domains of planning and reinforcement learning, which are concerned with sequential decision making and learning in intelligent agents.
After following this course, students will
- have a deep understanding of Markov Decision Processes and their role in planning and reinforcement learning,
- understand different settings studied in AI, especially in sequential decision making and reinforcement learning, including full vs partial observability, online vs offline, model-based vs model-free, single vs multi-agent, and Markovian vs non-Markovian,
- have an overview of the existing techniques and algorithms for planning and reinforcement learning under different conditions,
- understand how these techniques work, why they work, and when they work,
- be able to incorporate these techniques into intelligent agents, AI systems, and their applications,
- be up to date with the current state of the art and be able to familiarise themselves with new research results in the area.
Previous knowledge
Knowledge of machine learning and neural networks.
Order of Enrolment
Mixed prerequisite:
You may only take this course if you comply with the prerequisites. Prerequisites can be strict or flexible, or can imply simultaneity. A degree level can also be a prerequisite.
Explanation:
STRICT: You may only take this course if you have passed or applied tolerance for the courses for which this condition is set.
FLEXIBLE: You may only take this course if you have previously taken the courses for which this condition is set.
SIMULTANEOUS: You may only take this course if you also take the courses for which this condition is set (or have taken them previously).
DEGREE: You may only take this course if you have obtained this degree level.
SIMULTANEOUS(H02C1A) OR SIMULTANEOUS(H0E96A)
The codes of the course units mentioned above correspond to the following course descriptions:
H02C1A : Machine Learning and Inductive Inference
H0E96A : Beginselen van machine learning
Is included in these courses of study
- Master of Artificial Intelligence (Leuven) (Specialisation: Engineering and Computer Science (ECS)) 60 ects.
- Master in de ingenieurswetenschappen: artificiële intelligentie (Leuven) 120 ects.
Activities
3 ects. Reinforcement Learning: Lecture (B-KUL-H0O23a)
Content
Introduction to planning and reinforcement learning
Multi-armed bandits and their algorithms
-- exploration vs exploitation
-- rewards and regret
-- greedy algorithms
-- upper confidence bounds
Markov Decision Processes and their variants
-- Bellman Equations
-- Policies and value functions
-- Optimality
-- Partial and full observability
Dynamic Programming
-- Policy evaluation, improvement and iteration
-- Value iteration
Monte Carlo Methods
Temporal-difference learning
-- TD prediction
-- Q-learning
-- Sarsa
-- On-policy vs off-policy
-- n-step bootstrapping
Planning and learning with tabular methods
-- Dyna: integrated planning, acting and learning
-- Real-time dynamic programming
-- Monte Carlo tree search
Approximate methods
-- Value function approximation
-- Gradient methods
-- On-policy and off-policy variants
Policy gradient methods
-- Policy approximation
-- Policy gradients
-- Actor Critic
Contemporary topics
-- Deep reinforcement learning
-- Multi-agent reinforcement learning
-- Shielding and safe reinforcement learning
-- Relational reinforcement learning and traditional planning
Applications in game playing and beyond
Course material
Sutton and Barto, Reinforcement Learning: An Introduction, 2nd edition.
Additional materials on Toledo.
1 ects. Reinforcement Learning: Exercises (B-KUL-H0O24a)
Content
6 sessions of 2.5 hours each, plus some assignments.
The exercise sessions practice the concepts, models and techniques seen in the lectures.
Course material
The exercise material will be made available on Toledo.
Evaluation
Evaluation: Reinforcement Learning (B-KUL-H2O23a)
Explanation
The evaluation consists of a written exam in the exam period and permanent evaluation during the semester:
- The closed-book exam consists of a theoretical part and an exercise part.
- The permanent evaluation involves applying the material seen in the lectures and exercises in a new, practical context.
Information about retaking exams
The result of the permanent evaluation is carried over to the third examination period, but not to a following academic year.