Audio and Speech Processing (B-KUL-H09K5A)
Aims
Acquire insight into audio and speech processing.
Know, understand and apply mathematical models for speech signals in answering questions related to design and research. Gain insight into the structure of speech signals. Understand and apply hidden Markov model theory and related estimation problems. Recognize, understand and design the components of speech coders, recognizers and synthesizers.
Understand and apply basic principles and algorithms for audio processing, in particular for recording and playback of digital audio, beamforming, noise reduction, active noise control, echo and feedback cancellation, etc.
Previous knowledge
- Digital signal processing: DFT/FFT, filtering, z-transform, optimal and adaptive filters.
- Stochastic signals: multi-dimensional distribution functions, distributions, moments of distribution functions.
- Mathematics: optimization, constrained optimization.
Order of Enrolment
Mixed prerequisite:
You may only take this course if you comply with the prerequisites. Prerequisites can be strict or flexible, or can imply simultaneity. A degree level can also be a prerequisite.
Explanation:
STRICT: You may only take this course if you have passed or applied tolerance for the courses for which this condition is set.
FLEXIBLE: You may only take this course if you have previously taken the courses for which this condition is set.
SIMULTANEOUS: You may only take this course if you also take the courses for which this condition is set (or have taken them previously).
DEGREE: You may only take this course if you have obtained this degree level.
FLEXIBLE (H05F1A)
The codes of the course units mentioned above correspond to the following course descriptions:
H05F1A : Digital Signal Processing for Communications and Information Systems
Identical courses
This course is identical to the following courses:
H05C3A : Audio- en spraakverwerking (No longer offered this academic year)
Is included in these courses of study
- Master of Biomedical Engineering (Programme for students started before 2021-2022) (Leuven) 120 ects.
- Courses for Exchange Students Faculty of Engineering Science (Leuven)
- Master in de ingenieurswetenschappen: elektrotechniek (Leuven) (Informatiesystemen en signaalverwerking) 120 ects.
- Master of Electrical Engineering (Leuven) (Information Systems and Signal Processing) 120 ects.
- Master of Biomedical Engineering (Programme for students started in 2021-2022 or later) (Leuven) 120 ects.
Activities
2.41 ects. Audio Processing: Lecture (B-KUL-H09K6a)
Content
Chapter 1: Introduction
Chapter 2: Noise Reduction
Chapter 3: Fixed Beamforming
Chapter 4: Adaptive Beamforming & Multi-Channel Noise Reduction
Chapter 5: Acoustic Echo and Feedback Cancellation
Chapter 6: Sound Field Control
Chapter 7: Distributed Audio Signal Processing
Chapter 8: Sound Field Recording and Reproduction (guest lecture, attendance is mandatory)
Course material
Study cost: 1-10 euros (The information about the study costs as stated here gives an indication and only represents the costs for purchasing new materials. There might be some electronic or second-hand copies available as well. You can use LIMO to check whether the textbook is available in the library. Any potential printing costs and optional course material are not included in this price.)
Lecture slides and supporting textbooks.
0.59 ects. Audio Processing: Exercises and Laboratory Sessions (B-KUL-H09K7a)
Content
Design of a concrete real-time signal processing system for recording/playback of digital audio, based on concepts studied in the lectures.
Course material
Handouts and manuals
Format
Four supervised lab sessions plus homework.
2.41 ects. Speech Processing: Lecture (B-KUL-H09K8a)
Content
Part 1 Speech processing
Speech models at multiple levels:
* phonetics
* source-filter model, formants and linear predictive coding (LPC)
* feature vector extraction: estimation of LPC parameters, the Levinson-Durbin algorithm, short-time Fourier transforms, Mel spectra, cepstra
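As an illustration of the LPC parameter estimation topic listed above, the following is a minimal sketch of the Levinson-Durbin recursion. It is not taken from the course material. The function name `levinson_durbin`, the sign convention A(z) = 1 + a1 z^-1 + ... and the assumption that the autocorrelation values have already been computed are choices made for this example.

```python
def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, err): a = [1, a1, ..., a_order] are the coefficients of the
    prediction-error filter A(z) = 1 + a1 z^-1 + ... (assumed convention),
    and err is the prediction error power after the last stage.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update a[1..i-1] from the previous-stage coefficients, then set a[i].
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        # Each stage shrinks the error power by a factor (1 - k^2).
        err *= 1.0 - k * k
    return a, err
```

The recursion needs on the order of `order**2` operations, compared with roughly `order**3` for a general linear solver, because it exploits the Toeplitz structure of the autocorrelation matrix.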
Speech coding:
* LPC-based coders: CELP, MELP, RELP, RPE
* perceptual coders, including MP3
Speech recognition:
* Bayesian formulation, definition of Hidden Markov Models (HMM)
* likelihood of data under the HMM assumption
* HMM topology
* parameter estimation in HMMs
* the Viterbi algorithm
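To make the last item concrete, here is a minimal sketch of Viterbi decoding for an HMM with discrete observations, working in the log domain. It is an illustration only, not the course's reference implementation. The function name `viterbi` and the argument layout (`log_init[s]`, `log_trans[s][t]`, `log_emit[s][o]`) are assumptions made for this example.

```python
def viterbi(obs, log_init, log_trans, log_emit):
    """Return (path, log_score) of the most likely state sequence.

    obs       : list of observation indices
    log_init  : log_init[s]     = log P(first state is s)
    log_trans : log_trans[s][t] = log P(next state t | current state s)
    log_emit  : log_emit[s][o]  = log P(observation o | state s)
    """
    n_states = len(log_init)
    # delta[s]: best log score of any path ending in state s at the current frame.
    delta = [log_init[s] + log_emit[s][obs[0]] for s in range(n_states)]
    backptr = []
    for o in obs[1:]:
        new_delta, ptr = [], []
        for t in range(n_states):
            # Best predecessor of state t.
            best = max(range(n_states), key=lambda s: delta[s] + log_trans[s][t])
            ptr.append(best)
            new_delta.append(delta[best] + log_trans[best][t] + log_emit[t][o])
        delta = new_delta
        backptr.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    log_score = delta[state]
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, log_score
```

Working with log probabilities avoids numerical underflow on long observation sequences, which is why the sketch adds scores instead of multiplying probabilities.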
Language modelling:
* N-grams
* model estimation: maximum likelihood, leaving-one-out
* context-free grammars
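As a small illustration of maximum-likelihood model estimation for N-grams, the sketch below estimates bigram probabilities from relative frequencies. It is an example under stated assumptions, not course material: the function name `bigram_mle` and the sentence boundary markers `<s>` and `</s>` are hypothetical choices, and the leaving-one-out estimate is not covered.

```python
from collections import Counter


def bigram_mle(sentences):
    """Maximum-likelihood bigram probabilities P(w2 | w1).

    sentences: iterable of token lists. Each sentence is padded with the
    (assumed) boundary markers "<s>" and "</s>".
    Returns a dict mapping (w1, w2) to count(w1, w2) / count(w1 as history).
    """
    history_counts = Counter()
    bigram_counts = Counter()
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]
        for w1, w2 in zip(tokens, tokens[1:]):
            history_counts[w1] += 1
            bigram_counts[(w1, w2)] += 1
    return {bg: c / history_counts[bg[0]] for bg, c in bigram_counts.items()}
```

Bigrams that never occur in the training data are simply absent from the result, i.e. they receive probability zero. That is the weakness of the plain maximum-likelihood estimate which smoothing methods address.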
Speech synthesis:
* natural language processing, grapheme-to-phoneme conversion
* signal processing: source-filter synthesis, concatenative synthesis, the PSOLA algorithm
* synthesis with HMMs
Course material
Study cost: 1-10 euros (The information about the study costs as stated here gives an indication and only represents the costs for purchasing new materials. There might be some electronic or second-hand copies available as well. You can use LIMO to check whether the textbook is available in the library. Any potential printing costs and optional course material are not included in this price.)
0.59 ects. Speech Processing: Exercises and Laboratory Sessions (B-KUL-H09K9a)
Content
A project assignment in which students apply the concepts discussed in the lectures. Students hand in a report, which is graded.
Evaluation
Evaluation: Audio and Speech Processing (B-KUL-H29K5a)
Explanation
- The project work is graded during the lab sessions and on the basis of the submitted reports, and accounts for 25% of the grade.
- The written exam is an open-book exam held during the examination period and accounts for 75% of the grade.
Information about retaking exams
The retake exam (in the 3rd exam period) has additional questions on the project work.