Stockholm University

Julia Uddén, Associate Professor

Research projects

Publications

A selection from Stockholm University publication database

  • Investigating Conversational Dynamics in Human-Robot Interaction with fMRI

    2023. Torubarova Ekaterina (et al.). Proceedings of the 45th Annual Conference of the Cognitive Science Society

    Conference

    We investigated how verbal communication with a robot differs from talking to a human in terms of brain activity by analysing an open-source fMRI dataset. We focused on modeling conversational dynamics rather than conversation as a whole, by analysing fine-grained events, in particular turn initiation. The results indicate that turn initiation in a conversation with a human involves higher activation in auditory and visual cortex than turn initiation with a robot. Conversely, listening to the robot showed higher engagement of auditory cortex than listening to a human. We suggest that verbal and non-verbal turn-taking cues provided by the human agent engage more cognitive processing for picking up the turn. On the other hand, listening to a robot agent requires more processing than listening to a human. Both findings suggest that the accurate simulation of appropriate turn-taking cues and behaviors will help robots to establish more natural conversation dynamics and that the use of brain imaging can provide valuable objective measurements for assessing user states in human-robot interaction.

  • Phonotactics and syntax: investigating functional specialisation during structured sequence processing

    2023. Friederike Seyfried, Julia Uddén. Language, Cognition and Neuroscience 38 (3), 346-358

    Article

    Frontal lobe organisation displays a functional gradient, with overarching processing goals located in parts anterior to more subordinate goals, processed more posteriorly. Functional specialisation for syntax and phonology within language relevant areas has been supported by meta-analyses and reviews, but never directly tested experimentally. We tested for organised functional specialisation by manipulating syntactic case and phonotactics, creating violations at the end of otherwise matched and predictable sentences. Both violations led to increased activation in expected language regions. We observe the clearest signs of a functional gradient for language processing in the medial frontal cortex, where syntactic violations activated a more anterior portion compared to the phonotactic violations. A large overlap of syntactic and phonotactic processing in the left inferior frontal gyrus (LIFG) supports the view that general structured sequence processes are located in this area. These findings are relevant for understanding how sentence processing is implemented in hierarchically organised processing steps in the frontal lobe.

  • Why the GPT task of predicting the next word does not suffice to describe human language production: A conversational fMRI-study

    2023. Caroline Arvidsson, Johanna Sundström, Julia Uddén. Program Pdf of The 15th Annual Meeting of the Society for the Neurobiology of Language

    Conference

    Interest is surging around the “next-word-predictability” task that allowed large language models to reach their current capacity. It is sometimes claimed that prediction is enough to model language production. We set out to study predictability in an interactive setting. The current fMRI study used the information-theoretic measure of surprisal – the negative log-probability of a word occurring given the preceding linguistic context, estimated by a pre-trained language model (GPT-2). Surprisal has been shown to correlate with bottom-up processing located in the bilateral middle and superior temporal gyri (MTG/STG) during narrative comprehension (Willems et al., 2016). Still, surprisal has never been used to investigate conversational comprehension or any kind of language production. We hypothesized that previous results on surprisal in narrative comprehension would be replicated with conversational comprehension and that next-word-predictability would not encompass language production processes. We utilized a publicly available fMRI dataset in which participants (N=24) engaged in unscripted conversations (12 min/participant) via an audio-video link with a confederate outside the scanner. The conversational events Production, Comprehension, and Silence were modeled in a whole-brain analysis. Two parametric modulations of production and comprehension were added: (1) log-transformed context-independent word frequency (control regressor) and (2) surprisal. Production-surprisal and Comprehension-surprisal were respectively contrasted against the implicit baseline. These contrasts were compared with the contrasts Production and Comprehension vs implicit baseline. If surprisal merely indexed part of the activity in the latter, broader contrasts, this would provide a handle on production and comprehension processes beyond next-word-predictability.
    For surprisal in conversational production, we observed statistically significant clusters in the left inferior frontal gyrus (LIFG), the medial frontal gyrus, and the motor cortex. Importantly, Production vs implicit baseline showed bilateral STG activation while STG was not parametrically modulated by surprisal. Moreover, the bilateral MTG/STG were the only clusters active for Comprehension vs implicit baseline and they were also modulated by surprisal. For comprehension, we thus replicated the previous narrative comprehension study (Willems et al., 2016), showing that unpredictable words activate the bilateral MTG/STG also in conversational settings. Next-word-predictability is thus so far a good model for conversational comprehension. For production, however, the next-word-predictability task helped to home in on what is sometimes considered core production machinery in LIFG. Several functional interpretations of the STG recruitment during production are possible (such as monitoring for speech errors), but the current results point in the direction of two important conclusions: (1) a functional division of the frontal and temporal cortices during production, where the frontal component is prediction-related, and (2) that language processing during production is more than prediction, at least at the word-level. We provide a functional handle on such extra-predictive processes.
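
    As a rough illustration of the surprisal measure described in this abstract, the sketch below computes the negative log-probability of a word from its conditional probability given the preceding context. The probabilities are made-up placeholders, not GPT-2 estimates or values from the study, and the function name `surprisal` and the choice of base-2 logarithms (bits) are assumptions for illustration only.

```python
import math


def surprisal(probability: float) -> float:
    """Return the surprisal, in bits, of a word with the given
    conditional probability P(word | preceding context)."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    # Surprisal is the negative log-probability; base 2 gives bits.
    return -math.log2(probability)


# Hypothetical conditional probabilities (placeholders, not model output).
example_probabilities = {"the": 0.5, "cat": 0.125, "xylophone": 0.001}

# Less probable words receive higher surprisal.
word_surprisal = {
    word: surprisal(p) for word, p in example_probabilities.items()
}
```

    In a setting like the one described, the probabilities would come from a language model rather than a hand-written table; the per-word values could then serve as a parametric regressor.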

  • Individual Differences in Indirect Speech Act Processing Found Outside the Language Network

    2022. Katarina Bendtz (et al.). Neurobiology of Language 3 (2), 287-317

    Article

    Face-to-face communication requires skills that go beyond core language abilities. In dialogue, we routinely make inferences beyond the literal meaning of utterances and distinguish between different speech acts based on, e.g., contextual cues. It is, however, not known whether such communicative skills potentially overlap with core language skills or other capacities, such as theory of mind (ToM). In this functional magnetic resonance imaging (fMRI) study we investigate these questions by capitalizing on individual variation in pragmatic skills in the general population. Based on behavioral data from 199 participants, we selected participants with higher vs. lower pragmatic skills for the fMRI study (N = 57). In the scanner, participants listened to dialogues including a direct or an indirect target utterance. The paradigm allowed participants at the whole group level to (passively) distinguish indirect from direct speech acts, as evidenced by a robust activity difference between these speech acts in an extended language network including ToM areas. Individual differences in pragmatic skills modulated activation in two additional regions outside the core language regions (one cluster in the left lateral parietal cortex and intraparietal sulcus and one in the precuneus). The behavioral results indicate segregation of pragmatic skill from core language and ToM. In conclusion, contextualized and multimodal communication requires a set of interrelated pragmatic processes that are neurocognitively segregated: (1) from core language and (2) partly from ToM.

  • Supramodal Sentence Processing in the Human Brain: fMRI Evidence for the Influence of Syntactic Complexity in More Than 200 Participants

    2022. Julia Uddén (et al.). Neurobiology of Language 3 (4), 575-598

    Article

    This study investigated two questions. One is: To what degree is sentence processing beyond single words independent of the input modality (speech vs. reading)? The second question is: Which parts of the network recruited by both modalities are sensitive to syntactic complexity? These questions were investigated by having more than 200 participants read or listen to well-formed sentences or series of unconnected words. A largely left-hemisphere frontotemporoparietal network was found to be supramodal in nature, i.e., independent of input modality. In addition, the left inferior frontal gyrus (LIFG) and the left posterior middle temporal gyrus (LpMTG) were most clearly associated with left-branching complexity. The left anterior temporal lobe showed the greatest sensitivity to sentences that differed in right-branching complexity. Moreover, activity in LIFG and LpMTG increased from sentence onset to end, in parallel with an increase of the left-branching complexity. While LIFG, bilateral anterior temporal lobe, posterior MTG, and left inferior parietal lobe all contribute to the supramodal unification processes, the results suggest that these regions differ in their respective contributions to syntactic complexity related processing. The consequences of these findings for neurobiological models of language processing are discussed.

  • When did you stop speaking to yourself? Age-related differences in adolescents’ world knowledge-based audience design

    2022. Caroline Arvidsson, David Pagmar, Julia Uddén. Royal Society Open Science 9 (11)

    Article

    The ability to adapt utterances to the world knowledge of one’s addressee is undeniably ubiquitous in human social cognition, but its development and association with other cognitive mechanisms during adolescence have not been studied. In an online production task, we measured the ability of children entering adolescence (ages 11–12, M = 11.8, N = 29, 17 girls) and adolescents (ages 15–16, M = 15.9, N = 29, 17 girls) to tailor referential expressions in accordance with the inferred world knowledge of their addressee—an ability we refer to as world knowledge-based audience design (AD). A post-test survey showed that both age groups held similar assumptions about the addressees’ knowledge of referents, but the younger age group did not consistently adapt their utterances in accordance with these assumptions during online production, resulting in a significantly improved AD behaviour across age groups. We also investigated the reliance of AD on executive functions (EF). Executive functioning (as reflected by performance on the Wisconsin card sorting task) increased significantly with age, but did not explain the age-related increase in AD performance. We thus provide evidence in support of an adolescent development of world knowledge-based AD over and above development of EF.

  • Audience design and frame of reference in adolescents' reference production

    2021. Caroline Arvidsson, David Pagmar, Julia Uddén. Abstracts, 1519-1519

    Conference

    When participating in dialogue, speakers design their utterances to accommodate the individual needs of listeners (Bentz, et al., in prep). This feature is known as audience design (Clark & Murphy, 1982). Although audience design is central to conventional conversation, it is not known at which age speakers begin taking into account the world knowledge/frame of reference of their interlocutors. Indications from recent studies suggest that although preschool and first grade children engage in basic forms of perspective taking (Nadig & Sedivy, 2002), they fail to adapt their utterances in accordance with listener-specific needs in reference production (Pagmar, et al., in prep). Adult participants do, however, adapt their utterances, and individual differences in the adult population were not dependent on cognitive control function (Bentz, et al., in prep). The dependence on cognitive control function, e.g. switching, may be hypothesized to be greater in children. The current study aims to test the referential production of two age groups, early and mid adolescents (11;0-12;11 and 15;0-16;11), with the purpose of tracing the development of the ability to use information regarding listener-perspective during on-line referential production, and to test its relation to cognitive control. The paradigm builds further on the well-established Director’s task but does not require the participants to take the visual perspective of the listener. Instead, participants are presented with a set of pictures portraying referents well-known to them, e.g. popular cartoon characters, hosts of children’s tv-shows, etc. Knowledge of the referents is controlled through post-test surveys. Furthermore, they are asked to direct listeners of two distinct groups, small children and elders, into choosing the target referent.
    Participants who take the frame of reference of addressees into consideration are expected to adopt different strategies when addressing the different groups, i.e., increase informativeness when denoting referents assumed to be unknown to the listener vs using less informative referential expressions (such as proper names) when denoting referents judged to be known to the listener. Cognitive control/executive function is assessed using the Wisconsin card sorting task. Results are discussed in terms of cognitive costs of switching strategies and the Gricean maxim of quantity.

  • Hierarchical Structure in Sequence Processing

    2020. Julia Uddén (et al.). Topics in Cognitive Science 12 (3), 910-924

    Article

    In many domains of human cognition, hierarchically structured representations are thought to play a key role. In this paper, we start with some foundational definitions of key phenomena like “sequence” and “hierarchy,” and then outline potential signatures of hierarchical structure that can be observed in behavioral and neuroimaging data. Appropriate behavioral methods include classic ones from psycholinguistics along with some from the more recent artificial grammar learning and sentence processing literature. We then turn to neuroimaging evidence for hierarchical structure with a focus on the functional MRI literature. We conclude that, although a broad consensus exists about a role for a neural circuit incorporating the inferior frontal gyrus, the superior temporal sulcus, and the arcuate fasciculus, considerable uncertainty remains about the precise computational function(s) of this circuitry. An explicit theoretical framework, combined with an empirical approach focusing on distinguishing between plausible alternative hypotheses, will be necessary for further progress.
