Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modeling

Martin Wöllmer, Angeliki Metallinou, Florian Eyben, Björn Schuller, Shrikanth Narayanan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

161 Scopus citations

Abstract

In this paper, we apply a context-sensitive technique for multimodal emotion recognition based on feature-level fusion of acoustic and visual cues. We use bidirectional Long Short-Term Memory (BLSTM) networks which, unlike most other emotion recognition approaches, exploit long-range contextual information for modeling the evolution of emotion within a conversation. We focus on recognizing dimensional emotional labels, which enables us to classify both prototypical and non-prototypical emotional expressions contained in a large audiovisual database. Subject-independent experiments on various classification tasks reveal that the BLSTM network approach generally prevails over standard classification techniques such as Hidden Markov Models or Support Vector Machines, and achieves F1-measures of the order of 72%, 65%, and 55% for the discrimination of three clusters in emotional space and the distinction between three levels of valence and activation, respectively.
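
To make the described approach concrete, the sketch below shows a minimal bidirectional LSTM classifier over feature-level fused acoustic and visual frame sequences, written in PyTorch. It is an illustrative assumption only: the class name, feature dimensions, hidden size, and training framework are not taken from the paper, which reports its own BLSTM configuration.

```python
import torch
import torch.nn as nn

class BLSTMEmotionClassifier(nn.Module):
    """Minimal sketch: bidirectional LSTM over feature-level fused
    audio-visual frames. All dimensions are illustrative assumptions,
    not the configuration used in the paper."""

    def __init__(self, acoustic_dim=39, visual_dim=30, hidden_dim=128, num_classes=3):
        super().__init__()
        # Feature-level fusion: acoustic and visual frames are concatenated
        # before being passed to the recurrent layer.
        self.blstm = nn.LSTM(
            input_size=acoustic_dim + visual_dim,
            hidden_size=hidden_dim,
            batch_first=True,
            bidirectional=True,  # forward and backward context over the sequence
        )
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, acoustic, visual):
        # acoustic: (batch, time, acoustic_dim); visual: (batch, time, visual_dim)
        fused = torch.cat([acoustic, visual], dim=-1)  # feature-level fusion
        context, _ = self.blstm(fused)                 # long-range bidirectional context
        return self.classifier(context)                # per-frame 3-class logits

# Hypothetical usage: one utterance of 400 frames.
model = BLSTMEmotionClassifier()
acoustic = torch.randn(1, 400, 39)
visual = torch.randn(1, 400, 30)
logits = model(acoustic, visual)   # shape: (1, 400, 3)
```

The three output classes correspond to the paper's three-way tasks (three clusters in emotional space, or three levels of valence or activation); how features are extracted and how frame-level outputs are aggregated to a final label is left open here.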

Original language: English
Title of host publication: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Publisher: International Speech Communication Association
Pages: 2362-2365
Number of pages: 4
State: Published - 2010

Publication series

Name: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

Keywords

  • Context modeling
  • Emotion recognition
  • Hidden Markov models
  • Long short-term memory
  • Multimodality
