Corpus-Based Methods
The course covers data, methods and evidence in different linguistic traditions. It also explores quantitative properties of language, such as frequencies and n-grams.
The course also gives an overview of computational linguistic methods for automatic segmentation and annotation of text – including tokenisation, part-of-speech tagging and syntactic analysis – and describes how to search corpora using regular expressions.
We will also analyse corpora on the basis of occurrences and co-occurrences, and discuss the relationship between corpus material and research questions, as well as ethics, copyright, and licenses.
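To give a flavour of the methods listed above, the following is a minimal sketch in Python of tokenisation, frequency and n-gram counting, and a regular-expression search. It is an illustration only, not course material: the toy corpus, the function names `tokenise` and `ngrams`, and the very simple tokenisation rule are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenise(text):
    """Lowercase the text and extract word tokens with a simple regex.

    Assumption: a token is a run of ASCII letters, optionally followed
    by an apostrophe and more letters. Real tokenisers are more careful.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def ngrams(tokens, n):
    """Return all contiguous sequences of n tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical toy corpus, invented for this example.
corpus = "The cat sat on the mat. The cat saw the dog."

tokens = tokenise(corpus)

# Occurrences: how often each word form appears.
unigram_counts = Counter(tokens)

# Co-occurrences of adjacent words: bigram (2-gram) frequencies.
bigram_counts = Counter(ngrams(tokens, 2))

# Regular-expression search: all words that begin with "s".
s_words = re.findall(r"\bs\w+", corpus.lower())

print(unigram_counts.most_common(3))
print(bigram_counts.most_common(3))
print(s_words)
```

Counting adjacent pairs is only the simplest notion of co-occurrence; counts within a wider window or within a sentence follow the same pattern with a different way of pairing tokens.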
Teaching Format
The course consists of lectures and lab sessions.
Assessment
The course is examined through written exams and reports.
Examiner
You are welcome to contact our Student Office!





