Mats Wirén


Works at: Department of Linguistics
Telephone: 08-16 12 44
Visiting address: Universitetsvägen 10 C, floors 2-3
Room: C 361
Postal address: Institutionen för lingvistik, 106 91 Stockholm

About me

Head of the Section for Computational Linguistics




Corpus-based Methods, LIM024, 7.5 ECTS credits [in English]
Linguistic Methodology, LIN211, 7.5 ECTS credits [in Swedish]
Mathematical Methods for Linguists, LIN433, 7.5 ECTS credits [in Swedish]
Independent Project for the Degree of Bachelor, LIN600/LIN601, 15 ECTS credits [in Swedish]




A selection from the Stockholm University publication database
  • 2019. Mats Wirén (et al.). Selected papers from the CLARIN Annual Conference 2018, Pisa, 8-10 October 2018, 222-234

    Annotation of second-language learner text is a cumbersome manual task which in turn requires interpretation to postulate the intended meaning of the learner’s language. This paper describes SVALA, a tool which separates the logical steps in this process while providing rich visual support for each of them. The first step is to pseudonymize the learner text to fulfil the legal and ethical requirements for a distributable learner corpus. The second step is to correct the text, which is carried out in the simplest possible way by text editing. During the editing, SVALA automatically maintains a parallel corpus with alignments between words in the learner source text and corrected text, while the annotator may repair inconsistent word alignments. Finally, the actual labelling of the corrections (the postulated errors) is performed. We describe the objectives, design and workflow of SVALA, and our plans for further development.

  • 2019. Adam Ek, Mats Wirén. Proceedings of the Digital Humanities in the Nordic Countries 4th Conference, 124-132

    This paper presents a supervised method for a novel task, namely, detecting elements of narration in passages of dialogue in prose fiction. The method achieves an F1-score of 80.8%, exceeding the best baseline by almost 33 percentage points. The purpose of the method is to enable a more fine-grained analysis of fictional dialogue than has previously been possible, and to provide a component for the further analysis of narrative structure in general.

  • 2018. Adam Ek (et al.). 11th edition of the Language Resources and Evaluation Conference

    This paper describes an approach to identifying speakers and addressees in dialogues extracted from literary fiction, along with a dataset annotated for speaker and addressee. The overall purpose of this is to provide annotation of dialogue interaction between characters in literary corpora in order to allow for enriched search facilities and construction of social networks from the corpora. To predict speakers and addressees in a dialogue, we use a sequence labeling approach applied to a given set of characters. We use features relating to the current dialogue, the preceding narrative, and the complete preceding context. The results indicate that even with a small amount of training data, it is possible to build a fairly accurate classifier for speaker and addressee identification across different authors, though the identification of addressees is the more difficult task.

Last updated: February 17, 2020

