Henrik Nordström


Works at Department of Psychology
Telephone 08-16 48 04
Visiting address Frescati hagväg 9A
Room 213
Postal address Psykologiska institutionen 106 91 Stockholm


A selection from Stockholm University publication database
  • 2019. Henrik Nordström, Petri Laukka, Marc Pell.

    Emotional communication is an important part of social interaction because it gives individuals valuable information about the state of others, allowing them to adjust their behaviors and responses appropriately. When people use the voice to communicate, listeners do not only interpret the words that are said, the verbal content, but also the information contained in how the words are said, the nonverbal content. A large portion of the nonverbal content of the voice is thought to convey information about the emotional state of the speaker. The aim of this thesis was to study how humans communicate and interpret emotions via nonverbal aspects of the voice, and to describe these aspects in terms of acoustic parameters that allow listeners to interpret the emotional message.

    The thesis presents data from four studies investigating nonverbal communication of emotions from slightly different perspectives. In an as yet unpublished study, the acoustic parameters suggested to communicate discrete emotions – based on theoretical predictions of how the voice may be influenced by emotional episodes – were compared with empirical data derived from listeners’ judgments of actors portraying a wide variety of emotions. Results largely corroborated the theoretical predictions, suggesting that previous research has come far in explaining the mechanisms allowing listeners to infer emotions from the nonverbal aspects of speech. However, potentially important deviations were also observed. These deviations may be crucial to our understanding of how emotions are communicated in speech, and highlight the need to refine theoretical predictions to better describe the acoustic features that listeners use to understand emotional voices.

    In the first of the three published studies, Study 1, the common-sense notion that we are quick to hear the emotional state of a speaker was investigated and compared with the recognition of emotional expressivity in music. Results showed that listeners needed very little acoustic information to recognize emotions in both modes of communication. These findings suggest that low-level acoustic features that are available to listeners in the first tenths of a second carry much of the emotional message and that these features may be used in both speech and music.

    By investigating listeners’ recognition of vocal bursts – the kind of sounds people make when they are not speaking – results from Study 2 showed that listeners can recognize several emotional expressions across cultures, including emotions that are often difficult to recognize from speech. The study thus suggests that the voice is an even more versatile means for emotional communication than previously thought.

    Study 3 also investigated emotional communication in a cross-cultural setting. However, instead of studying emotion recognition in terms of discrete categories, this study investigated whether nonverbal aspects of the voice can carry information about how the speaker evaluated the situation that elicited the emotion. Results showed that listeners were able to infer several aspects about the situation, which suggests that nonverbal expressions may have a symbolic meaning comprising several dimensions other than valence and arousal that can be understood across cultures.

    Taken together, the results of this thesis suggest that humans use nonverbal manipulations of the voice to communicate emotions and that these manipulations can be understood quickly and accurately by listeners both within and across cultures. Although decades of research have investigated how this communication occurs, the acoustic parameters allowing listeners to interpret emotions are still elusive. The data from the four studies in this thesis, the methods used, and the acoustic analyses performed shed new light on this process. Future research in the field may benefit from a more standardized approach across studies, both when it comes to acoustic analysis and experimental design. This would facilitate comparisons of findings between different studies and allow for a more cumulative science within the field of emotional communication in the human voice.

  • 2019. Henrik Nordström, Petri Laukka. Journal of the Acoustical Society of America 145 (5), 3058-3074

    The auditory gating paradigm was adopted to study how much acoustic information is needed to recognize emotions from speech prosody and music performances. In Study 1, brief utterances conveying ten emotions were segmented into temporally fine-grained gates and presented to listeners, whereas Study 2 instead used musically expressed emotions. Emotion recognition accuracy increased with increasing gate duration and generally stabilized after a certain duration, with different trajectories for different emotions. Above-chance accuracy was observed for ≤ 100 ms stimuli for anger, happiness, neutral, and sadness, and for ≤ 250 ms stimuli for most other emotions, for both speech and music. This suggests that emotion recognition is a fast process that allows discrimination of several emotions based on low-level physical characteristics. The emotion identification points, which reflect the amount of information required for stable recognition, were shortest for anger and happiness for both speech and music, but recognition took longer to stabilize for music vs speech. This, in turn, suggests that acoustic cues that develop over time also play a role in emotion inferences (especially for music). Finally, acoustic cue patterns were positively correlated between speech and music, suggesting a shared acoustic code for expressing emotions. © 2019 Acoustical Society of America.

  • 2019. Henrik Nordström, Petri Laukka.

    This study aimed to test how well acoustic features suggested by the literature could predict listener behavior in a forced-choice vocal emotion recognition task. Fourteen actors, ranging from amateurs to professionals, were instructed to vocally portray 13 emotion expressions (Anger, Contempt, Disgust, Fear, Happiness, Interest, Lust, Pride, Relief, Sadness, Serenity, Shame, and Tenderness). An “optimal pattern”-index was calculated for each recorded portrayal reflecting how well it mirrored an emotion-specific “optimal pattern” of acoustic features inferred from the literature. Listeners (N = 102) judged the portrayals in a forced-choice vocal emotion recognition task. Each listener judged a subset of the 756 portrayals resulting in an average of 16.8 (SD = 2.1) judgments per portrayal. The “optimal pattern”-index was then used to predict the proportion of listeners who selected each emotion label for each portrayal. Results showed that the “optimal pattern”-index predicted perceived emotion for anger, happiness, and sadness, but not for any of the other emotions. This suggests that the acoustic features conveying most of the emotions included in the current study need to be further explored. To this end, we present descriptive acoustic data for all portrayals for which a majority (> 50%) of the listeners selected the same emotion label. These descriptive results suggest new acoustic patterns that, if replicated, might lead to more stable predictions about the acoustic features underlying listener judgments of specific emotions in the voice.

  • 2017. Henrik Nordström (et al.). Royal Society Open Science 4 (11)

    This study explored the perception of emotion appraisal dimensions on the basis of speech prosody in a cross-cultural setting. Professional actors from Australia and India vocally portrayed different emotions (anger, fear, happiness, pride, relief, sadness, serenity and shame) by enacting emotion-eliciting situations. In a balanced design, participants from Australia and India then inferred aspects of the emotion-eliciting situation from the vocal expressions, described in terms of appraisal dimensions (novelty, intrinsic pleasantness, goal conduciveness, urgency, power and norm compatibility). Bayesian analyses showed that the perceived appraisal profiles for the vocally expressed emotions were generally consistent with predictions based on appraisal theories. Few group differences emerged, which suggests that the perceived appraisal profiles are largely universal. However, some differences between Australian and Indian participants were also evident, mainly for ratings of norm compatibility. The appraisal ratings were further correlated with a variety of acoustic measures in exploratory analyses, and inspection of the acoustic profiles suggested similarity across groups. In summary, results showed that listeners may infer several aspects of emotion-eliciting situations from the non-verbal aspects of a speaker's voice. These appraisal inferences also seem to be relatively independent of the cultural background of the listener and the speaker.

  • 2014. Jesper J. Alvarsson (et al.). Journal of the Acoustical Society of America 135 (6), 3455-3462

    Studies of effects on speech intelligibility from aircraft noise in outdoor places are currently lacking. To explore these effects, first-order ambisonic recordings of aircraft noise were reproduced outdoors in a pergola. The average background level was 47 dB LAeq. Lists of phonetically balanced words (LASmax,word = 54 dB) were reproduced simultaneously with aircraft passage noise (LASmax,noise = 72-84 dB). Twenty individually tested listeners wrote down each presented word while seated in the pergola. The main results were (i) aircraft noise negatively affects speech intelligibility at sound pressure levels that exceed those of the speech sound (signal-to-noise ratio, S/N < 0), and (ii) the simple A-weighted S/N ratio was nearly as good an indicator of speech intelligibility as were two more advanced models, the Speech Intelligibility Index and Glasberg and Moore's [J. Audio Eng. Soc. 53, 906-918 (2005)] partial loudness model. This suggests that any of these indicators is applicable for predicting effects of aircraft noise on speech intelligibility outdoors.

  • 2013. Petri Laukka (et al.). Frontiers in Psychology 4, 353

    Which emotions are associated with universally recognized non-verbal signals? We address this issue by examining how reliably non-linguistic vocalizations (affect bursts) can convey emotions across cultures. Actors from India, Kenya, Singapore, and USA were instructed to produce vocalizations that would convey nine positive and nine negative emotions to listeners. The vocalizations were judged by Swedish listeners using a within-valence forced-choice procedure, where positive and negative emotions were judged in separate experiments. Results showed that listeners could recognize a wide range of positive and negative emotions with accuracy above chance. For positive emotions, we observed the highest recognition rates for relief, followed by lust, interest, serenity and positive surprise, with affection and pride receiving the lowest recognition rates. Anger, disgust, fear, sadness, and negative surprise received the highest recognition rates for negative emotions, with the lowest rates observed for guilt and shame. By way of summary, results showed that the voice can reveal both basic emotions and several positive emotions other than happiness across cultures, but self-conscious emotions such as guilt, pride, and shame seem not to be well recognized from non-linguistic vocalizations.

  • 2012. Henrik Nordström, Stefan Wiens. BMC Neuroscience (Online) 13, 49

    Background: In research on event-related potentials (ERP) to emotional pictures, greater attention to emotional than neutral stimuli (i.e., motivated attention) is commonly indexed by two difference waves between emotional and neutral stimuli: the early posterior negativity (EPN) and the late positive potential (LPP). Evidence suggests that if attention is directed away from the pictures, then the emotional effects on EPN and LPP are eliminated. However, a few studies have found residual emotional effects on EPN and LPP. In these studies, pictures were shown at fixation, and picture composition was that of simple figures rather than that of complex scenes. Because figures elicit larger LPP than do scenes, figures might capture and hold attention more strongly than do scenes. Here, we showed negative and neutral pictures of figures and scenes and tested, first, whether emotional effects are larger for figures than scenes for both EPN and LPP, and second, whether emotional effects on EPN and LPP are reduced less for unattended figures than scenes.

    Results: Emotional effects on EPN and LPP were larger for figures than scenes. When pictures were unattended, emotional effects on EPN increased for scenes but tended to decrease for figures, whereas emotional effects on LPP decreased similarly for figures and scenes.

    Conclusions: Emotional effects on EPN and LPP were larger for figures than scenes, but these effects did not resist manipulations of attention more strongly for figures than scenes. These findings imply that the emotional content captures attention more strongly for figures than scenes, but that the emotional content does not hold attention more strongly for figures than scenes.


Last updated: May 8, 2020
