TY - GEN
T1 - On the influence of phonetic content variation for acoustic emotion recognition
AU - Vlasenko, Bogdan
AU - Schuller, Björn
AU - Wendemuth, Andreas
AU - Rigoll, Gerhard
PY - 2008
Y1 - 2008
AB - Acoustic Modeling in today's emotion recognition engines employs general models independent of the spoken phonetic content. This seems to work well enough given sufficient instances to cover for a broad variety of phonetic structures and emotions at the same time. However, data is usually sparse in the field and the question arises whether unit specific models as word emotion models could outperform the typical general models. In this respect this paper tries to answer the question how strongly acoustic emotion models depend on the textual and phonetic content. We investigate the influence on the turn and word level by use of state-of-the-art techniques for frame and word modeling on the well-known public Berlin Emotional Speech and Speech Under Simulated and Actual Stress databases. In the result it is clearly shown that the phonetic structure does strongly influence the accuracy of emotion recognition.
UR - http://www.scopus.com/inward/record.url?scp=48249126771&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-69369-7_24
DO - 10.1007/978-3-540-69369-7_24
M3 - Conference contribution
AN - SCOPUS:48249126771
SN - 3540693688
SN - 9783540693680
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 217
EP - 220
BT - Perception in Multimodal Dialogue Systems - 4th IEEE Tutorial and Research Workshop on Perception and Interactive Technologies for Speech-Based Systems, PIT 2008, Proceedings
T2 - 4th IEEE Tutorial and Research Workshop on Perception and Interactive Technologies for Speech-Based Systems, PIT 2008
Y2 - 16 June 2008 through 18 June 2008
ER -