Analyzing Emotion Expression in Singing via Flow Glottograms, Long-Term-Average Spectra, and Expert Listener Evaluation

2021. Johan Sundberg, Gláucia Laís Salomão, Klaus R. Scherer. Journal of Voice 35 (1), 52-60

Article

Background

Acoustic aspects of emotional expressivity in speech have been analyzed extensively during recent decades. Emotional coloring is an important, if not the most important, property of sung performance and is therefore strictly controlled. Hence, studying emotional expressivity in singing may promote a deeper insight into the vocal signaling of emotions. Furthermore, physiological voice source parameters can be assumed to facilitate the understanding of acoustical characteristics.

Method

Three highly experienced professional male singers sang scales on the vowel /ae/ or /a/ in 11 emotional colors (Neutral, Sadness, Tender, Calm, Joy, Contempt, Fear, Pride, Love, Arousal, and Anger). Sixteen voice experts classified the scales in a forced-choice listening test, and the results were compared with long-term-average spectrum (LTAS) parameters and with voice source parameters derived from flow glottograms (FLOGG), which were obtained by inverse filtering the audio signal.

Results

On the basis of component analysis, the emotions could be grouped into four “families”: Anger-Contempt, Joy-Love-Pride, Calm-Tender-Neutral, and Sad-Fear. Listeners recognized the intended emotion families with accuracy far above chance level. Vocal loudness had a paramount influence on all LTAS and FLOGG parameters. Even after partialing out this factor, some significant correlations were found between FLOGG and LTAS parameters. These parameters could be sorted into groups that were associated with the emotion families.
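As an illustration of what "partialing out" loudness means, here is a minimal sketch of a first-order partial correlation: the correlation between two parameters after the linear influence of a third has been removed. The function names and the toy numbers in the usage note are hypothetical and are not taken from the paper, whose exact statistical procedure is not described in the abstract.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y with the linear effect of z removed.

    Here z stands in for vocal loudness (an assumption for illustration).
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

With toy values such as `z = [1, 2, 3, 4]`, `x = [2, 1, 2, 5]`, and `y = [2, -1, 6, 3]`, `x` and `y` are positively correlated only because both follow `z`; once `z` is partialed out, the correlation vanishes.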

Conclusions

(i) Both LTAS and FLOGG parameters varied significantly with the enactment intentions of the singers. (ii) Some aspects of the voice source are reflected in LTAS parameters. (iii) LTAS parameters affect listener judgment of the enacted emotions and the accuracy of the intended emotional coloring.


What does LTAS tell about the voice source?

2018. Johan Sundberg, Gláucia Laís Salomão, Klaus R. Scherer. 47th Annual Symposium: Care of the Professional Voice, 15-15

Conference

Objective: The long-term-average spectrum, or LTAS, has been extensively used in voice research. It provides an overall measure of voice characteristics from which a large number of parameters can be derived. A minimalistic set of parameters has been identified which captures the most essential properties [Eyben et al., 2015; 2016; Scherer et al., 2017]. LTAS analysis is typically applied to audio signals of running speech or continuous singing. It reflects the combination of formant frequency and voice source characteristics. Often, e.g. in clinical settings, it is relevant to distinguish between these two sources. Voice source analysis can be performed by means of inverse filtering. The aim of the present work was to analyse the relationships between LTAS and voice source properties.
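For readers unfamiliar with the measure, the following is a minimal sketch of how an LTAS can be computed: the power spectra of successive windowed frames are averaged, and summary parameters such as the proportion of energy below a cutoff frequency are read off the average. The frame length, Hann window, naive DFT, and all function names are assumptions chosen for clarity; they are not the authors' analysis settings.

```python
import cmath
import math

def power_spectrum(frame):
    """One-sided power spectrum of a frame via a naive DFT (slow but dependency-free)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        spectrum.append(abs(s) ** 2)
    return spectrum

def ltas(signal, frame_len, hop):
    """Average the power spectra of Hann-windowed frames taken every `hop` samples."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    if len(starts) == 0:
        raise ValueError("signal is shorter than one frame")
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len) for i in range(frame_len)]
    spectra = [
        power_spectrum([signal[s + i] * window[i] for i in range(frame_len)])
        for s in starts
    ]
    return [sum(column) / len(spectra) for column in zip(*spectra)]

def low_band_proportion(avg_power, sample_rate, frame_len, cutoff_hz=1000.0):
    """Share of the averaged (one-sided) spectral energy lying below cutoff_hz."""
    low = sum(
        p for k, p in enumerate(avg_power) if k * sample_rate / frame_len < cutoff_hz
    )
    return low / sum(avg_power)
```

A pure 500 Hz tone sampled at 8000 Hz with 64-sample frames puts essentially all of its averaged energy below 1000 Hz, while a 2000 Hz tone puts essentially none there.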

Method: Three internationally touring male singers sang scales in eleven different emotional colours. This material was analysed by inverse filtering as well as in terms of LTAS. The correlations between the averages across the scale tones of the flow glottogram parameters and the minimalistic set of LTAS parameters were analysed.

Results/Conclusions: A strong negative correlation was found between spectral slope and the flow glottogram’s maximum flow declination rate (MFDR), and a strong positive correlation between the proportion of spectral energy below 1000 Hz and H1-H2. Somewhat surprisingly, a strong negative correlation was found between equivalent sound level and the normalized and un-normalized amplitude quotients (the ratio between the AC peak-to-peak amplitude of the flow glottogram and MFDR). Thus, these LTAS parameters seem particularly informative with respect to voice source characteristics.
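The flow glottogram quantities named above can be illustrated with a short sketch. It follows the definition given in the abstract (amplitude quotient = AC peak-to-peak flow amplitude divided by MFDR) and assumes, as in the usual definition, that the normalized quotient divides this by the fundamental period. The function name, the first-difference derivative, and the single-period input are simplifying assumptions, not the authors' procedure.

```python
def amplitude_quotients(flow, sample_rate, f0):
    """Return (MFDR, AQ, NAQ) for one period of a sampled glottal flow waveform.

    flow        -- flow samples covering one glottal cycle
    sample_rate -- sampling rate in Hz
    f0          -- fundamental frequency in Hz (period T0 = 1 / f0)
    """
    # First difference scaled to flow units per second.
    derivative = [(b - a) * sample_rate for a, b in zip(flow, flow[1:])]
    # MFDR is the magnitude of the steepest flow decrease.
    mfdr = -min(derivative)
    ac_amplitude = max(flow) - min(flow)
    aq = ac_amplitude / mfdr   # amplitude quotient, in seconds
    naq = aq * f0              # AQ divided by the period T0
    return mfdr, aq, naq
```

For a toy triangular pulse `[0, 0.25, 0.5, 0.75, 1, 0.5, 0, 0, 0, 0]` sampled at 1000 Hz with `f0 = 100`, the steepest decline is 500 units per second, which gives AQ = 0.002 s and NAQ = 0.2.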


Expressão vocal de emoções [Vocal expression of emotions]: metáfora sonora, fala e canto [Sound metaphors, speech and singing]

2016. Gláucia Laís Salomão. Sonoridades [Sonorities], 31-43

Chapter

The communication of emotions is crucial to social relationships and plays a fundamental role in maintaining the social order between people. In this chapter we look at the communication of emotions through two expressive modalities that use sound as a means of communication, i.e. speech and singing. Throughout the text we argue that the vocal expression of emotions reflects physiological aspects associated with the emotion being expressed; that there are many similarities between the expressive patterns found in speech and in singing; and that singing is expressive because it carries traces of the expressive patterns of speech.


Emotion in the singing voice—a deeper look at acoustic features in the light of automatic classification

2015. Florian Eyben (et al.). EURASIP Journal on Audio, Speech, and Music Processing

Article

We investigate the automatic recognition of emotions in the singing voice and study the worth and role of a variety of relevant acoustic parameters. The data set contains phrases and vocalises sung by eight renowned professional opera singers in ten different emotions and a neutral state. The states are mapped to ternary arousal and valence labels. We propose a small set of relevant acoustic features based on our previous findings on the same data and compare it with a large-scale state-of-the-art feature set for paralinguistics recognition, the baseline feature set of the Interspeech 2013 Computational Paralinguistics ChallengE (ComParE). A feature importance analysis with respect to classification accuracy and correlation of features with the targets is provided in the paper. Results show that the classification performance with both feature sets is similar for arousal, while the ComParE set is superior for valence. Intra-singer feature ranking criteria further improve the classification accuracy significantly in a leave-one-singer-out cross-validation.
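The leave-one-singer-out scheme mentioned above can be sketched as follows: each singer in turn is held out for testing while the model is trained on all the others, so the reported accuracy reflects generalization to an unseen voice. The nearest-centroid classifier used here is only a hypothetical stand-in to keep the example self-contained; the abstract does not state which classifier the authors used, and all function names are illustrative.

```python
def leave_one_singer_out(singers):
    """Yield (held_out_singer, train_indices, test_indices) for each singer."""
    for held_out in sorted(set(singers)):
        train = [i for i, s in enumerate(singers) if s != held_out]
        test = [i for i, s in enumerate(singers) if s == held_out]
        yield held_out, train, test

def fit_centroids(features, labels):
    """Stand-in classifier: the mean feature vector of each class."""
    grouped = {}
    for vec, label in zip(features, labels):
        grouped.setdefault(label, []).append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }

def predict(centroids, vec):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vec, centroids[label])),
    )

def loso_accuracy(features, labels, singers):
    """Overall accuracy when every singer is tested on a model that never saw them."""
    correct = 0
    for _, train, test in leave_one_singer_out(singers):
        centroids = fit_centroids(
            [features[i] for i in train], [labels[i] for i in train]
        )
        correct += sum(predict(centroids, features[i]) == labels[i] for i in test)
    return correct / len(labels)
```

Splitting by singer rather than by recording matters because recordings from the same singer share voice characteristics; a random split would let the classifier recognize the singer instead of the emotion.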


The Swedish MINT Project: modelling infant language acquisition from parent-child interaction

2015. Tove Gerholm (et al.).

Conference

The MINT project is a longitudinal study of verbal and nonverbal interaction between 73 Swedish children and their parents, recorded in a lab environment from 3 months to 3 years of age. The overall goal of the project is to deepen our understanding of how language acquisition takes place in a multimodal and interactional framework.


Gláucia Laís Salomão, Researcher
