Domain Adapting Deep Reinforcement Learning for Real-world Speech Emotion Recognition

Thejan Rajapakshe, Rajib Rana, Sara Khalifa, Bjorn W. Schuller

Research output: Contribution to journal › Article › peer-review

Abstract

Speech emotion recognition (SER) enables computers to engage with people in an emotionally intelligent way. The inability to adapt an existing model to a new domain is one of the significant limitations of SER methods. To overcome this challenge, domain adaptation techniques have been developed to transfer the knowledge learnt by a model across domains. Although existing domain adaptation techniques have improved the performance of SER models across domains, there is a need to improve their ability to adapt to real-world situations where models can self-tune while deployed. This paper presents a deep reinforcement learning-based strategy (RL-DA) for adapting a pre-trained SER model to a real-world setting by interacting with the environment and collecting continuous feedback. The proposed RL-DA technique is evaluated on SER tasks, including cross-corpus and cross-language domain adaptation scenarios. Our evaluation results show that RL-DA achieves significant improvements of 11% and 14% in testing accuracy over a fully supervised baseline for cross-corpus and cross-language scenarios, respectively, in the real-world setting. This technique also outperforms the baseline model on both speaker-independent and speaker-dependent SER tasks.
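
The abstract does not say which reinforcement learning algorithm RL-DA uses, so the following is only a minimal sketch of the general idea of adapting a pre-trained classifier from feedback collected during deployment. It assumes a plain REINFORCE-style policy-gradient update on a linear softmax classifier, which is a stand-in for the deep model and not the authors' method. All names (`predict_proba`, `reinforce_step`, `adapt_online`, the `feedback` callback, the reward values) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, features):
    """Emotion-class probabilities from a linear layer (one weight row per class)."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return softmax(logits)


def reinforce_step(weights, features, action, reward, lr=0.1):
    """One REINFORCE update: W += lr * reward * grad log p(action | features).

    For a softmax policy, the gradient with respect to logit k is
    (1 if k == action else 0) - p_k, scaled by the input features.
    """
    probs = predict_proba(weights, features)
    new_weights = []
    for k, row in enumerate(weights):
        grad_logit = (1.0 if k == action else 0.0) - probs[k]
        new_weights.append(
            [w + lr * reward * grad_logit * f for w, f in zip(row, features)]
        )
    return new_weights


def adapt_online(weights, stream, feedback, lr=0.1, rng=None):
    """Adapt assumed pre-trained weights on a stream of target-domain inputs.

    `feedback(features, action)` is a hypothetical callback returning a scalar
    reward, e.g. +1 if the environment confirms the predicted emotion, else -1.
    """
    rng = rng or random.Random(0)
    for features in stream:
        probs = predict_proba(weights, features)
        # Sample an emotion label from the current policy.
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = feedback(features, action)
        weights = reinforce_step(weights, features, action, reward, lr)
    return weights
```

In this sketch, a positive reward raises the probability of the sampled label for that input and a negative reward lowers it, which is the sense in which a deployed model can "self-tune" from feedback without labelled target-domain data.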

Original language: English
Journal: IEEE Access
State: Accepted/In press - 2024
Externally published: Yes

Keywords

  • Domain Adaptation
  • Reinforcement Learning
  • Speech Emotion Recognition
