Exploring Perception Uncertainty for Emotion Recognition in Dyadic Conversation and Music Listening

Jing Han, Zixing Zhang, Zhao Ren, Björn Schuller

Research output: Contribution to journal, Article, peer-review

10 Scopus citations

Abstract

Predicting emotions automatically is an active field of research in affective computing. Because emotion perception is subjective, the label of an emotional instance is usually derived from the opinions of multiple annotators. That is, a labelled instance is often accompanied by the corresponding inter-rater disagreement information, which we call here the perception uncertainty. As shown in previous studies, such uncertainty can provide supplementary information that improves recognition performance in this subjective task. In this paper, we propose a multi-task learning framework that leverages knowledge of the perception uncertainty to improve prediction performance. In particular, our novel framework exploits the perception uncertainty explicitly to adjust an initial prediction dynamically, in contrast to a conventional multi-task learning framework, which merely estimates the emotional state and the perception uncertainty simultaneously. To evaluate the feasibility and effectiveness of the proposed method, we perform extensive experiments on time- and value-continuous emotion prediction in audiovisual conversation and music listening scenarios. Compared with other state-of-the-art approaches, our approach yields remarkable performance improvements on both datasets. The results indicate that integrating perception uncertainty information can enhance the learning process.
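
The abstract does not state how the inter-rater disagreement is quantified. One common choice, assumed here purely for illustration, is the per-frame standard deviation across annotators, with the per-frame mean serving as the emotion label. The sketch below shows that assumption only; the function name and the ratings are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def gold_standard_and_uncertainty(ratings):
    """Derive a label trace and a disagreement trace from annotator ratings.

    ratings: list of per-annotator traces, each a list of floats with one
    value per time frame. All traces must have the same length.

    Returns (gold, uncertainty): per-frame mean and per-frame population
    standard deviation across annotators. Using the standard deviation as
    the "perception uncertainty" is an assumption made for this sketch.
    """
    if not ratings:
        raise ValueError("need at least one annotator trace")
    n_frames = len(ratings[0])
    if any(len(trace) != n_frames for trace in ratings):
        raise ValueError("all annotator traces must have the same length")

    # Transpose so that each tuple holds all annotators' values for one frame.
    frames = list(zip(*ratings))
    gold = [mean(frame) for frame in frames]
    uncertainty = [pstdev(frame) for frame in frames]
    return gold, uncertainty


# Hypothetical arousal ratings from three annotators over four frames.
ratings = [
    [0.25, 0.5, 0.0, -0.5],
    [0.25, 0.0, 0.0, 0.5],
    [0.25, 1.0, 0.0, 0.0],
]
gold, uncertainty = gold_standard_and_uncertainty(ratings)
```

Frames on which all annotators agree get an uncertainty of zero, while frames with diverging ratings get a larger value. In a multi-task setup, both traces could serve as regression targets.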

Original language: English
Pages (from-to): 231-240
Number of pages: 10
Journal: Cognitive Computation
Volume: 13
Issue number: 2
State: Published - Mar 2021
Externally published: Yes

Keywords

  • Dynamic learning
  • Emotion prediction
  • Multi-task learning
  • Perception uncertainty modelling
