Combining speech recognition and acoustic word emotion models for robust text-independent emotion recognition

Björn Schuller, Bogdan Vlasenko, Dejan Arsic, Gerhard Rigoll, Andreas Wendemuth

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

23 Scopus citations

Abstract

Emotion recognition in speech usually relies on acoustic models that ignore the spoken content. Likewise, one general model per emotion is trained independently of the phonetic structure. Given sufficient data, this approach seemingly works well enough. This paper asks whether acoustic emotion recognition strongly depends on phonetic content, and whether models tailored to the spoken unit can lead to higher accuracies. We therefore investigate phoneme and word models using a large prosodic, spectral, and voice quality feature space and Support Vector Machines (SVM). The experiments also take into account the need for automatic speech recognition (ASR) to select the appropriate unit models. Test runs on the well-known EMO-DB database under speaker-independent conditions demonstrate the superiority of word emotion models over today's common general models, provided the words occur sufficiently often in the training corpus.

Original language: English
Title of host publication: 2008 IEEE International Conference on Multimedia and Expo, ICME 2008 - Proceedings
Pages: 1333-1336
Number of pages: 4
State: Published - 2008
Event: 2008 IEEE International Conference on Multimedia and Expo, ICME 2008 - Hannover, Germany
Duration: 23 Jun 2008 → 26 Jun 2008

Publication series

Name: 2008 IEEE International Conference on Multimedia and Expo, ICME 2008 - Proceedings

Conference

Conference: 2008 IEEE International Conference on Multimedia and Expo, ICME 2008
Country/Territory: Germany
City: Hannover
Period: 23/06/08 → 26/06/08

Keywords

  • Acoustic modeling
  • Affective speech
  • Emotion recognition
  • Word models
