Methods of Corpus Linguistics (B-KUL-F0TU1A)
Aims
Cognitive and functional approaches to linguistics typically take the form of usage-based models of language: they assume that language should not be studied in isolation, but in the context of actual communicative interactions. Methodologically speaking, this implies that corpus linguistics is an important tool for work within the cognitive-functional framework. Methods and techniques for dealing with the large collections of usage data that are found in linguistic corpora are an indispensable part of the equipment of cognitive and functional linguists.
In this course, students will become acquainted with the use of corpus techniques in the context of cognitive and functional theory development. The aim is twofold: on the one hand, to introduce a number of advanced techniques of corpus analysis, and on the other, to demonstrate how these techniques contribute to theory development.
Included in these programmes of study
- Master in de taalkunde (Leuven) 60 ects.
- Master in de taal- en letterkunde (Leuven) 60 ects.
- Master of Statistics and Data Science (on campus) (Leuven) (Statistics and Data Science for Social, Behavioral and Educational Sciences) 120 ects.
- Courses for Exchange Students Faculty of Arts (Leuven)
- Master of Digital Humanities (Leuven) 60 ects.
- Master of Advanced Studies in Linguistics (Programme for students started before 2024-2025) (Leuven et al) 60 ects.
- Educatieve master in de talen (Leuven) 120 ects.
Activities
6 ects. Methods of Corpus Linguistics (B-KUL-F0TU1a)
Content
The purpose of the course is achieved by concentrating on two points of theoretical focus. In each case, a specific type of theoretical problem is approached on a case study basis, and analytic techniques that are specifically suited to deal with that type of theoretical problem are introduced in the context of the case study.
The first problem to be addressed concerns the distribution of specific constructions. The way in which a specific construction, like an existential construction or a particular word order pattern, occurs in actual language use is often co-determined by various factors. Whether a given construction rather than another surfaces depends not just on the semantics, i.e. what is being expressed, but also on the way in which the discourse is organized and on the specific (stylistic or sociolinguistic) features of the text. The theoretical problem to be solved involves the impact of these various factors (grammatical, discursive, sociostylistic): how can one disentangle them? The relevant quantitative technique that we will introduce is logistic regression.
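As an illustration of how logistic regression separates the contribution of several factors, the following is a minimal sketch in plain Python. It is not the tool used in the course: the counts are invented toy data, the two binary predictors (`given_topic`, `informal_register`) are hypothetical, and the fitting routine is a simple batch gradient ascent on the log-likelihood chosen for transparency rather than efficiency.

```python
import math

# Invented toy counts, NOT real corpus data.
# Key: (given_topic, informal_register), both hypothetical binary predictors.
# Value: (tokens with construction A, tokens with the alternative construction).
TOY_COUNTS = {
    (0, 0): (2, 8),
    (0, 1): (4, 6),
    (1, 0): (6, 4),
    (1, 1): (9, 1),
}

def expand(counts):
    """Turn the count table into one row per observed token."""
    rows, labels = [], []
    for features, (hits, misses) in counts.items():
        rows += [features] * (hits + misses)
        labels += [1] * hits + [0] * misses
    return rows, labels

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    """Probability of construction A; weights[0] is the intercept."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)

def fit_logistic(rows, labels, learning_rate=0.5, epochs=5000):
    """Estimate the coefficients by batch gradient ascent on the log-likelihood."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, label in zip(rows, labels):
            error = label - predict(weights, features)
            gradient[0] += error
            for j, x in enumerate(features, start=1):
                gradient[j] += error * x
        weights = [w + learning_rate * g / len(rows)
                   for w, g in zip(weights, gradient)]
    return weights

rows, labels = expand(TOY_COUNTS)
weights = fit_logistic(rows, labels)
for name, w in zip(["intercept", "given_topic", "informal_register"], weights):
    print(f"{name}: {w:+.2f}")
```

Each coefficient estimates the change in the log-odds of construction A associated with one factor while the other factor is held constant, which is what makes it possible to weigh, say, a discursive factor against a sociostylistic one.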
The second problem type involves the relationship between language varieties. When we are confronted with differences in language use of a stylistic, sociolinguistic or geographical nature, we will want to determine exactly which language varieties to distinguish: which stylistic levels, for instance, should we distinguish? And what relations exist between the different dialects of a language? The techniques that we will introduce to deal with such questions involve measures of the linguistic distance between language varieties (how close are two varieties to each other?) and multidimensional scaling methods (how do the varieties cluster together?).
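The two steps mentioned above, measuring distances between varieties and then scaling them into a low-dimensional map, can be sketched as follows. This is only an illustration under stated assumptions: the variety names and frequency profiles are invented, Euclidean distance between relative-frequency profiles stands in for whichever distance measure the course actually uses, and the scaling step is classical (metric) multidimensional scaling computed with a simple power iteration.

```python
import math

# Invented relative frequencies of three competing variants per variety.
# Variety names and numbers are hypothetical, NOT real corpus data.
PROFILES = {
    "variety_A": [0.70, 0.20, 0.10],
    "variety_B": [0.65, 0.25, 0.10],
    "variety_C": [0.20, 0.30, 0.50],
    "variety_D": [0.15, 0.30, 0.55],
}

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def distance_matrix(profiles):
    return [[euclidean(p, q) for q in profiles] for p in profiles]

def top_eigenpair(matrix, iterations=200):
    """Largest-magnitude eigenvalue and its eigenvector, by power iteration."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        vec = [v / norm for v in nxt]
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n))
                for i in range(n))
    return value, vec

def classical_mds(dist, dims=2):
    """Classical MDS: double-centre the squared distances, keep the top axes."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]  # equals column means (symmetric)
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[] for _ in range(n)]
    for _ in range(dims):
        value, vec = top_eigenpair(b)
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i].append(scale * vec[i])
        # Deflate: remove the extracted axis before looking for the next one.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords

names = list(PROFILES)
dist = distance_matrix(list(PROFILES.values()))
coords = classical_mds(dist, dims=2)
for name, point in zip(names, coords):
    print(name, [round(c, 3) for c in point])
```

Varieties with similar profiles end up close together in the resulting two-dimensional map, so groupings of varieties can be read off from the coordinates. With a non-Euclidean distance measure the centred matrix may have negative eigenvalues, in which case a more robust eigensolver would be needed.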
For both problem types, the relevant technical apparatus will be introduced in the form of user-friendly computer tools, so that students can apply the techniques independently.
Course material
Presentations and digital course textbook (accessible via Toledo).