Multi-task deep neural network with shared hidden layers: Breaking down the wall between emotion representations

Yue Zhang, Yifan Liu, Felix Weninger, Bjorn Schuller

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

50 Scopus citations

Abstract

Emotion representations are psychological constructs for modelling, analysing, and recognising emotion, being one essential element of affect. Because emotion is complex, the boundaries between different emotion concepts are often fuzzy, which is also reflected in the diversification of emotion databases and their inconsistent target labels. When facing data scarcity as an ever-present issue for acoustic emotion recognition, the straightforward method to jointly use the existing data resources is to map various emotion labels onto one common dimensional space; this, however, comes with considerable information loss. To solve the dilemma of data aggregation whilst efficiently exploiting the emotion labels in terms of their original meaning and interrelations, we advocate the usage of multi-task deep neural networks with shared hidden layers (MT-SHL-DNN), in which the feature transformations are shared across different emotion representations, while the output layers are separately associated with each emotion database. On nine frequently used emotional speech corpora and two different acoustic feature sets, we demonstrate that the MT-SHL-DNN method outperforms the single-task DNNs trained with only one emotion representation.
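
The shared-hidden-layer idea in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the ReLU and softmax choices, the random untrained weights, and the corpus names `corpus_a` and `corpus_b` are all assumptions made for illustration, and training is omitted. It only shows the structure, in which one stack of hidden layers is reused for every corpus while each corpus has its own output layer with its own label set.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(layer, x):
    """Compute W x + b for one input vector."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


class SharedHiddenMultiTaskNet:
    """Hidden layers shared by all corpora; one output layer per corpus."""

    def __init__(self, n_features, hidden_sizes, classes_per_corpus, seed=0):
        rng = random.Random(seed)
        sizes = [n_features] + list(hidden_sizes)
        # Shared feature transformation (assumed sizes, untrained weights).
        self.shared = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
        # One separate output layer per corpus, sized to that corpus's labels.
        self.heads = {
            name: make_layer(sizes[-1], n_classes, rng)
            for name, n_classes in classes_per_corpus.items()
        }

    def forward(self, x, corpus):
        hidden = x
        for layer in self.shared:
            hidden = relu(affine(layer, hidden))
        return softmax(affine(self.heads[corpus], hidden))


# Hypothetical setup: two corpora with 4 and 7 emotion labels respectively.
model = SharedHiddenMultiTaskNet(
    n_features=6,
    hidden_sizes=[8, 8],
    classes_per_corpus={"corpus_a": 4, "corpus_b": 7},
)
features = [0.2, -0.1, 0.5, 0.0, 0.3, -0.4]  # placeholder acoustic features
probs_a = model.forward(features, "corpus_a")
probs_b = model.forward(features, "corpus_b")
print(len(probs_a), len(probs_b))
```

In a trained version of such a model, each batch would presumably update the shared layers and only the output layer of the corpus the batch came from; that training loop is left out here because the abstract does not specify it.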

Original language: English
Title of host publication: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4990-4994
Number of pages: 5
ISBN (Electronic): 9781509041176
State: Published - 16 Jun 2017
Externally published: Yes
Event: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - New Orleans, United States
Duration: 5 Mar 2017 - 9 Mar 2017

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017
Country/Territory: United States
City: New Orleans
Period: 5/03/17 - 9/03/17

Keywords

  • Affective Computing
  • Deep Neural Networks
  • Emotion Recognition
  • Multi-task Learning
