Kristina Nilsson Björkenstam

Works at the Department of Linguistics
Telephone 08-16 17 61
Visiting address Universitetsvägen 10 C, floors 2-3
Room C 366
Postal address Institutionen för lingvistik, 106 91 Stockholm

About me

Researcher at the Computational Linguistics section.

Director of studies for general linguistics, phonetics and computational linguistics at the undergraduate and advanced levels. Coordinating director of studies for linguistics and sign language.





A selection from Stockholm University's publication database
  • 2019. Carla Wikse Barrow, Kristina Nilsson Björkenstam, Sofia Strömbergsson. Journal of Child Language 46 (2), 199-213

    This study aimed to investigate concerns of validity and reliability in subjective ratings of age-of-acquisition (AoA), through exploring characteristics of the individual rater. An additional aim was to validate the obtained AoA ratings against two corpora – one of child speech and one of adult speech – specifically exploring whether words over-represented in the child-speech corpus are rated with lower AoA than words characteristic of the adult-speech corpus. The results show that less than one-third of participating informants’ ratings are valid and reliable. However, individuals with high familiarity with preschool-aged children provide more valid and reliable ratings, compared to individuals who do not work with or have children of their own. The results further show a significant, age-adjacent difference in rated AoA for words from the two different corpora, thus strengthening their validity. The study provides AoA data, of high specificity, for 100 child-specific and 100 adult-specific Swedish words.

  • 2018. Paul Ibbotson, Rose M. Hartman, Kristina Nilsson Björkenstam. Language, Cognition and Neuroscience 33 (6), 1-15

    We present an open-access analytic tool, which allows researchers to simultaneously control for and combine language data from the child, the caregiver, multiple languages, and across multiple time points to make inferences about the social and cognitive factors driving the shape of language development. We demonstrate how the tool works in three domains of language learning and across six languages. The results demonstrate the usefulness of this approach as well as providing deeper insight into three areas of language production and acquisition: egocentric language use, the learnability of nouns versus verbs, and imageability. We have made the Frequency Filter tool freely available as an R-package for other researchers to use at

  • 2017. Sofia Strömbergsson (et al.). Proceedings of the 18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017), Stockholm: The International Speech Communication Association (ISCA), 2017, 2214-2217

    Child-directed spoken data is the ideal source of support for claims about children’s linguistic environments. However, phonological transcriptions of child-directed speech are scarce, compared to sources like adult-directed speech or text data. Acquiring reliable descriptions of children’s phonological environments from more readily accessible sources would mean considerable savings of time and money. The first step towards this goal is to quantify the reliability of descriptions derived from such secondary sources. We investigate how phonological distributions vary across different modalities (spoken vs. written), and across the age of the intended audience (children vs. adults). Using a previously unseen collection of Swedish adult- and child-directed spoken and written data, we combine lexicon look-up and grapheme-to-phoneme conversion to approximate phonological characteristics. The analysis shows distributional differences across datasets both for single phonemes and for longer phoneme sequences. Some of these are predictably attributed to lexical and contextual characteristics of text vs. speech. The generated phonological transcriptions are remarkably reliable. The differences in phonological distributions between child-directed speech and secondary sources highlight a need for compensatory measures when relying on written data or on adult-directed spoken data, and/or for continued collection of actual child-directed speech in research on children’s language environments.

  • 2017. Kristina Nilsson Björkenstam, Gintaré Grigonyté. Språktidningen (2), 24-27
  • 2016. Kristina Nilsson Björkenstam, Mats Wirén, Robert Östling. The 54th Annual Meeting of the Association for Computational Linguistics, 82-90

    How do infants learn the meanings of their first words? This study investigates the informativeness and temporal dynamics of non-verbal cues that signal the speaker's referent in a model of early word–referent mapping. To measure the information provided by such cues, a supervised classifier is trained on information extracted from a multimodally annotated corpus of 18 videos of parent–child interaction with three children aged 7 to 33 months. Contradicting previous research, we find that gaze is the single most informative cue, and we show that this finding can be attributed to our fine-grained temporal annotation. We also find that offsetting the timing of the non-verbal cues reduces accuracy, especially if the offset is negative. This is in line with previous research, and suggests that synchrony between verbal and non-verbal cues is important if they are to be perceived as causally related.

  • 2014. Kristina Nilsson Björkenstam, Sofia Gustafson Capková, Mats Wirén. Strindberg on International Stages/Strindberg in Translation

    We have approached the works of August Strindberg from a computational linguistic point of view, resulting in The Stockholm University Strindberg Corpus, consisting of seven of Strindberg's autobiographical works with linguistic annotation. The corpus is freely available for research. We use this corpus for three quantitative studies of Strindberg’s work: in the first, we describe the novels included in the corpus by keywords; in the second, we compare Strindberg’s use of emotionally charged words with selected prose of both his contemporaries and present-day authors; in the third, we explore the semantic prosody of KVINNA (“woman”) and MAN (“man”).

  • Article SUC-CORE
    2013. Kristina Nilsson Björkenstam. Northern European Journal of Language Technology (NEJLT) 3 (2), 19-39

    This paper describes SUC-CORE, a subset of the Stockholm Umeå Corpus and the Swedish Treebank annotated with noun phrase coreference. While most coreference annotated corpora consist of texts of similar types within related domains, SUC-CORE consists of both informative and imaginative prose and covers a wide range of literary genres and domains. This allows for exploration of coreference across different text types, but it also means that there are limited amounts of data within each type. Future work on coreference resolution for Swedish should include making more annotated data available for the research community.


Last updated: 12 February 2019
