Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge

Björn Schuller, Anton Batliner, Stefan Steidl, Dino Seppi

Research output: Contribution to journal › Article › peer-review

587 Scopus citations

Abstract

More than a decade has passed since research on automatic recognition of emotion from speech established itself as a new field alongside its 'big brothers', speech and speaker recognition. This article attempts to provide a short overview of where we are today, how we got there, and what this can tell us about where to go next and how we could arrive there. In the first part, we address the basic phenomenon, reflecting on the last fifteen years and commenting on databases, modelling and annotation, the unit of analysis, and prototypicality. We then shift to automatic processing, including discussions of features, classification, robustness, evaluation, and implementation and system integration. From there we turn to the first comparative challenge on emotion recognition from speech - the INTERSPEECH 2009 Emotion Challenge, organised by (part of) the authors - including a description of the Challenge's database, Sub-Challenges, participants and their approaches, the winners, and the fusion of results, leading to the lessons actually learnt, before we finally address the persistent problems and promising future directions.
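On the evaluation point raised in the abstract: the Challenge's competition measure was unweighted average recall (UAR), i.e. recall averaged over classes with equal weight, which is not inflated by majority classes in the way plain accuracy is - a relevant property for realistic, class-imbalanced emotion corpora. The following Python sketch illustrates the computation; the function name and the toy labels are illustrative only and not taken from the article.

```python
from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    """Per-class recall, averaged with equal class weight (UAR).

    Unlike accuracy (weighted average recall), UAR is not dominated
    by majority classes -- important for imbalanced emotion corpora.
    """
    correct = defaultdict(int)  # per-class correctly classified counts
    total = defaultdict(int)    # per-class instance counts
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Toy example with hypothetical labels: two classes, 4:1 imbalance.
y_true = ["NEG", "NEG", "NEG", "NEG", "IDL"]
y_pred = ["NEG", "NEG", "NEG", "IDL", "IDL"]
print(unweighted_average_recall(y_true, y_pred))  # (3/4 + 1/1) / 2 = 0.875
```

Note that a majority-class classifier would score 80% accuracy on this toy set but only 50% UAR, which is why UAR is the more honest figure for skewed data.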

Original language: English
Pages (from-to): 1062-1087
Number of pages: 26
Journal: Speech Communication
Volume: 53
Issue number: 9-10
State: Published - Nov 2011

Keywords

  • Adaptation
  • Affect
  • Automatic classification
  • Emotion
  • Evaluation
  • Feature selection
  • Feature types
  • Noise robustness
  • Standardisation
  • Usability
