Acoustic emotion recognition: A benchmark comparison of performances

Björn Schuller, Bogdan Vlasenko, Florian Eyben, Gerhard Rigoll, Andreas Wendemuth

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution (peer-reviewed)

223 Scopus citations

Abstract

In light of the first challenge on emotion recognition from speech, we provide the largest-to-date benchmark comparison under equal conditions on nine standard corpora in the field, using the two predominant paradigms: frame-level modeling by means of Hidden Markov Models and supra-segmental modeling by systematic feature brute-forcing. The investigated corpora are the ABC, AVIC, DES, EMO-DB, eNTERFACE, SAL, SmartKom, SUSAS, and VAM databases. To provide better comparability among sets, we additionally cluster each database's emotions into binary valence and arousal discrimination tasks. As a result, large differences are found among corpora, mostly stemming from naturalistic emotions and spontaneous speech versus more prototypical events. Further, supra-segmental modeling proves significantly beneficial on average when several classes are addressed at a time.
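As a rough illustration of two ideas from the abstract, the sketch below (not the authors' code) shows (1) how discrete emotion labels can be clustered into binary arousal and valence tasks and (2) the supra-segmental paradigm of applying statistical functionals to frame-level acoustic contours so that each utterance yields one fixed-length feature vector. The label mapping, the functional set, and the toy contours are illustrative assumptions, not the paper's actual feature configuration.

```python
# Minimal sketch, assuming a simplified label mapping and functional set.
import numpy as np

# (1) Hypothetical clustering of EMO-DB-style categories into binary tasks;
#     the exact assignment in the paper may differ.
AROUSAL = {"anger": "high", "fear": "high", "joy": "high",
           "boredom": "low", "sadness": "low", "neutral": "low"}
VALENCE = {"joy": "positive", "neutral": "positive",
           "anger": "negative", "fear": "negative",
           "boredom": "negative", "sadness": "negative"}

# (2) Supra-segmental functionals over frame-level low-level descriptors (LLDs),
#     e.g. pitch and energy contours extracted per frame.
def functionals(lld: np.ndarray) -> np.ndarray:
    """Map a (num_frames, num_llds) contour to one fixed-length vector."""
    stats = [lld.mean(axis=0), lld.std(axis=0),
             lld.min(axis=0), lld.max(axis=0),
             np.percentile(lld, 25, axis=0), np.percentile(lld, 75, axis=0)]
    return np.concatenate(stats)

# Toy usage: 300 frames of 2 LLDs -> 12 supra-segmental features.
utterance_llds = np.random.rand(300, 2)
feature_vector = functionals(utterance_llds)
print(feature_vector.shape)                 # (12,)
print(AROUSAL["anger"], VALENCE["anger"])   # high negative
```

In this paradigm the classifier (e.g. an SVM) operates on the per-utterance functional vector, whereas the frame-level HMM paradigm models the LLD sequence directly.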

Original language: English
Title of host publication: Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2009
Pages: 552-557
Number of pages: 6
DOIs
State: Published - 2009
Event: 2009 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2009 - Merano, Italy
Duration: 13 Dec 2009 - 17 Dec 2009

Publication series

Name: Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2009

Conference

Conference: 2009 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2009
Country/Territory: Italy
City: Merano
Period: 13/12/09 - 17/12/09
