Advanced Natural Language Processing (B-KUL-H02B1B)

Aims
The course focuses on an in-depth understanding of methods and algorithms for building computer software that understands, generates and manipulates human language. We study the algorithms and models while introducing core tasks in natural language processing (NLP), including language modeling, syntactic analysis, semantic interpretation, machine translation, coreference resolution, discourse analysis, machine reading, question answering and dialogue modeling. We illustrate the methods and technologies with current applications in real-world settings. After completing this course, the student will have acquired an in-depth understanding of contemporary machine learning models designed for processing human language and of the underlying computational properties of NLP models. The student will have learned how underlying linguistic phenomena, that is, linguistic features, can be modeled and automatically learned from data using deep learning techniques, and will be able to understand papers in the NLP field.
Previous knowledge
Basics of linear algebra and probability theory; foundations of machine learning; computer programming.
Activities
3.5 ects. Natural Language Processing: Lecture (B-KUL-H02B1a)
Content
1. Introduction
- What is natural language processing (NLP)?
- The current state of the art in NLP
- Ambiguity
- Other challenges
- Representing words, phrases and sentences
2. Segmentation and tokenization
- Regular expressions
- Word tokenization, lemmatization and stemming
- Sentence segmentation
- Subword tokenization
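As a toy illustration of subword tokenization, the sketch below learns byte-pair encoding (BPE) merge rules from a word-frequency dictionary; the corpus, symbol conventions and function names are invented for the example and are not part of the course material.

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` with the merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def learn_bpe(word_freqs, num_merges):
    """Learn BPE merge rules from {tuple_of_symbols: frequency} (toy sketch)."""
    vocab = dict(word_freqs)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for bigram in zip(symbols, symbols[1:]):
                pairs[bigram] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        vocab = {tuple(merge_pair(list(s), best)): f for s, f in vocab.items()}
    return merges
```

On the classic toy corpus {low, lower, newest, widest} with character symbols, the first learned merges join frequent suffix fragments such as "e"+"s" and "es"+"t".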
3. Language Modelling
- N-gram language models
- Perplexity
- Maximum likelihood estimation
- Smoothing
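The n-gram concepts listed above fit in a few lines of code. The following is a minimal sketch, not course code: a bigram model estimated by maximum likelihood, with add-one (Laplace) smoothing applied at query time and perplexity computed over a sentence.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Collect bigram and context counts from a list of token lists."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        unigrams.update(tokens[:-1])          # context counts c(w_prev)
        bigrams.update(zip(tokens, tokens[1:]))  # pair counts c(w_prev, w)
    return unigrams, bigrams, len(vocab)

def bigram_prob(w_prev, w, unigrams, bigrams, V):
    # Add-one smoothing: (c(w_prev, w) + 1) / (c(w_prev) + V)
    return (bigrams[(w_prev, w)] + 1) / (unigrams[w_prev] + V)

def perplexity(sent, unigrams, bigrams, V):
    tokens = ["<s>"] + sent + ["</s>"]
    log_prob = sum(math.log(bigram_prob(a, b, unigrams, bigrams, V))
                   for a, b in zip(tokens, tokens[1:]))
    return math.exp(-log_prob / (len(tokens) - 1))
```

Smoothing keeps unseen bigrams from receiving zero probability, which would otherwise make the perplexity of any sentence containing them infinite.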
4. Neural Language Modelling
- Word embeddings
- Vector space models for NLP
- Recurrent neural networks (RNNs) for language modelling
- Transformer architecture for language modelling
- Use of language models in downstream tasks: fine-tuning and pretraining
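A minimal illustration of the vector-space view underlying word embeddings: words represented as vectors (here hand-made co-occurrence counts, invented for the example) and compared by cosine similarity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy co-occurrence counts over contexts (pet, bark, wheel) -- invented numbers
vectors = {
    "dog": [8, 6, 0],
    "cat": [7, 1, 0],
    "car": [0, 0, 9],
}
```

With these toy vectors, "dog" is closer to "cat" than to "car", since "dog" and "car" share no contexts and their dot product is zero; learned embeddings aim to produce such geometry automatically from data.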
5. Part-of-Speech (POS) Tagging
- Hidden Markov models and the Viterbi algorithm
- Conditional Random Fields
- (Bi)LSTM for POS tagging
- Encoder-decoder architecture for sequence-to-sequence labelling
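The HMM-plus-Viterbi combination can be sketched with a toy two-tag example. All probabilities and the noun/verb tag set below are invented for illustration.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable tag sequence for `obs` under a simple HMM."""
    # trellis[t][s] = (best probability of ending in s at time t, best predecessor)
    trellis = [{s: (start_p[s] * emit_p[s].get(obs[0], 0.0), None) for s in states}]
    for t in range(1, len(obs)):
        column = {}
        for s in states:
            prev = max(states, key=lambda p: trellis[t - 1][p][0] * trans_p[p][s])
            column[s] = (trellis[t - 1][prev][0] * trans_p[prev][s]
                         * emit_p[s].get(obs[t], 0.0), prev)
        trellis.append(column)
    # Backtrace from the best final state.
    state = max(states, key=lambda s: trellis[-1][s][0])
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        state = trellis[t][state][1]
        path.append(state)
    return list(reversed(path))
```

Dynamic programming keeps the search over tag sequences linear in sentence length rather than exponential, at O(n·|states|²) time.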
6. Morphological analysis
- Inflection and derivation
- Finite state morphology
- Sequence-to-sequence neural models of morphological inflection
7. Syntactic Parsing
- Universal Dependencies
- Dependency parsing: graph-based and transition-based approaches
- Constituent parsing with a (probabilistic) context free grammar ((P)CFG) and the Cocke-Younger-Kasami (CYK) algorithm
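CYK recognition over a grammar in Chomsky normal form fits in a short dynamic-programming sketch; the toy grammar and sentence below are invented for the example.

```python
def cyk(tokens, grammar, start="S"):
    """CYK recognition: does `grammar` (in CNF) derive `tokens` from `start`?

    `grammar` maps a nonterminal to a list of right-hand sides, each either
    a 1-tuple (terminal,) or a 2-tuple (NT1, NT2).
    """
    n = len(tokens)
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    # Spans of length 1: terminal rules.
    for i, tok in enumerate(tokens):
        for lhs, rhss in grammar.items():
            if (tok,) in rhss:
                table[i][i + 1].add(lhs)
    # Longer spans: try every split point k and every binary rule.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for lhs, rhss in grammar.items():
                    for rhs in rhss:
                        if (len(rhs) == 2 and rhs[0] in table[i][k]
                                and rhs[1] in table[k][j]):
                            table[i][j].add(lhs)
    return start in table[0][n]
```

The probabilistic variant stores the best rule probability per nonterminal in each cell instead of a set, turning recognition into Viterbi-style parsing.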
8. Semantics (lexical and compositional)
- Word sense disambiguation
- Semantic role labelling
9. Discourse: Coreference Resolution
- Discourse coherence
- Algorithm of Hobbs
- Neural end-to-end coreference resolution
10. Question Answering
- Evolution of QA systems from rule-based to neural
- From complex pipelines to end-to-end and retrieval-free approaches
- Closed-domain vs open-domain
- Text-only vs multimodal
11. Neural Machine Translation
- Encoder-decoder architecture (e.g., RNN, transformer-based)
- Attention models
- Improvements and alternative architectures that deal with limited parallel training data
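At its core, the attention mechanism in these encoder-decoder models is a softmax-weighted average of value vectors. The following single-query, pure-Python sketch of scaled dot-product attention uses invented toy vectors and is only an illustration.

```python
import math

def attention(query, keys, values):
    """Scaled dot-product attention for one query over lists of key/value vectors."""
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Numerically stable softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Weighted average of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]
```

A query that strongly matches one key yields a weight near 1 for that key's value, letting the decoder focus on the relevant source position.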
12. Conversational Dialogue Systems and Chatbots
- Task-oriented dialogue agents: rule-based versus neural approaches
- Chatbots: End-to-end sequence-to-sequence neural models
Course material
Handbooks
Daniel Jurafsky and James H. Martin. 2024. Speech and Language Processing: An Introduction to Natural
Language Processing, Computational Linguistics, and Speech Recognition with Language Models, 3rd edition.
Jacob Eisenstein. 2019. Introduction to Natural Language Processing. MIT Press.
Yoav Goldberg. 2016. A Primer on Neural Network Models for Natural Language Processing.
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. MIT Press.
+ recent articles, e.g., from the proceedings of ACL, AAAI and NeurIPS.
Format: more information
Interactive lectures with short exercises.
0.5 ects. Natural Language Processing: Exercises (B-KUL-H00G0a)
Content
- Exercises on tokenization and segmentation
- Exercises on language modelling and POS tagging
- Exercises on syntactic parsing
- Exercises on semantic and discourse processing
- Exercises on machine translation
- Exercises on question answering
2 ects. Natural Language Processing: Project (B-KUL-H0O15a)
Content
The project focuses on gaining fundamental insight into advanced aspects of natural language processing, in particular learning representations of language data and solving a complex NLP task, such as question answering, dialogue understanding and generation, or machine reading, together with the subtasks these involve. The assignment is a programming assignment handed out to the students in separate parts.
Course material
The assignment, explanations and documentation can be downloaded from KU Leuven's Toledo platform:
- http://toledo.kuleuven.be
Format: more information
Computer session - Individual assignment - Project work
Evaluation
Evaluation: Advanced Natural Language Processing (B-KUL-H22B1b)
Information about retaking exams
A second exam opportunity is offered for all components except the project work.