TY - CHAP
T1 - Computational analysis of vocal expression of affect
T2 - Trends and challenges
AU - Scherer, Klaus
AU - Schüller, Björn
AU - Elkins, Aaron
N1 - Publisher Copyright:
© Judee K. Burgoon, Nadia Magnenat-Thalmann, Maja Pantic and Alessandro Vinciarelli 2017.
PY - 2017/1/1
Y1 - 2017/1/1
N2 - In this chapter we first provide a short introduction to the “classic” audio features used in this field and to the methods that enable the automatic recognition of human emotion as reflected in the voice. From there, we focus on the main trends leading up to the central challenges for future research. It must be noted that the line is difficult to draw here: what counts as a contemporary trend, and where does the “future” start? Further, several of the trends and challenges named are not limited to the analysis of speech but hold for many, if not all, modalities. We focus on examples and references in the speech analysis domain. “Classic Features”: Perceptual and Acoustic Measures. Systematic treatises on the importance of emotional expression in speech communication and its powerful impact on the listener can be found throughout history. Early Greek and Roman manuals on rhetoric (e.g., by Aristotle, Cicero, and Quintilian) suggested concrete strategies for making speech emotionally expressive. Evolutionary theorists such as Spencer, Bell, and Darwin highlighted the social functions of emotional expression in speech and music. The empirical investigation of the effect of emotion on the voice started with psychiatrists trying to diagnose emotional disturbances and with early radio researchers concerned with the communication of speaker attributes and states via vocal cues in speech, using the newly developed methods of electroacoustic analysis.
Systematic research programs started in the 1960s, when psychiatrists renewed their interest in diagnosing affective states, nonverbal communication researchers explored the capacity of different bodily channels to carry signals of emotion, emotion psychologists charted the expression of emotion in different modalities, and linguists, particularly phoneticians, discovered the importance of pragmatic information, all of them making use of ever more sophisticated technology to study the effects of emotion on the voice (see Scherer, 2003, for further details). While much of the relevant research has focused exclusively on the recognition of vocally expressed emotions by naive listeners, research on the production of emotional speech has used the extraction of acoustic parameters from the speech signal as a method to understand the patterning of the vocal expression of different emotions.
UR - http://www.scopus.com/inward/record.url?scp=85048187383&partnerID=8YFLogxK
U2 - 10.1017/9781316676202.006
DO - 10.1017/9781316676202.006
M3 - Chapter
AN - SCOPUS:85048187383
SN - 1107161266
SN - 9781107161269
SP - 56
EP - 68
BT - Social Signal Processing
PB - Cambridge University Press
ER -