Cross-language acoustic emotion recognition: An overview and some tendencies

Silvia Monica Feraru, Dagmar Schuller, Björn Schuller

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

57 Scopus citations

Abstract

Automatic emotion recognition from speech has matured close to the point where it attracts broader commercial interest. One of the last major limiting factors is the ability to deal with multilingual input, as will be encountered in a real-life operating system in many if not most cases. Because speech in real-life scenarios often mixes languages, more experience is needed with the performance effects of cross-language recognition. In this contribution, we first provide an overview of the languages covered in research on emotion and speech, finding that only roughly two thirds of native speakers' languages have been touched upon so far. We then shed light on mismatched versus matched condition emotion recognition across a variety of languages. By intention, we include less researched languages of more distant language families such as Burmese, Romanian or Turkish. Binary arousal and valence mapping is employed so that we can train and test across databases that were originally labelled with diverse categories. As one may expect, arousal recognition works considerably better across languages than valence recognition, and cross-language recognition falls considerably behind within-language recognition. However, within-language-family recognition seems to provide an 'emergency solution' when language resources are missing, and the notable differences observed depending on the combination of languages show a number of interesting effects.

Original language: English
Title of host publication: 2015 International Conference on Affective Computing and Intelligent Interaction, ACII 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 125-131
Number of pages: 7
ISBN (Electronic): 9781479999538
State: Published - 2 Dec 2015
Event: 2015 International Conference on Affective Computing and Intelligent Interaction, ACII 2015 - Xi'an, China
Duration: 21 Sep 2015 → 24 Sep 2015

Publication series

Name: 2015 International Conference on Affective Computing and Intelligent Interaction, ACII 2015

Conference

Conference: 2015 International Conference on Affective Computing and Intelligent Interaction, ACII 2015
Country/Territory: China
City: Xi'an
Period: 21/09/15 → 24/09/15

Keywords

  • Cross-Corpus
  • Multilinguality
  • Speech Emotion Recognition
