Automatically estimating emotion in music with deep long-short term memory recurrent neural networks

Eduardo Coutinho, George Trigeorgis, Stefanos Zafeiriou, Björn Schuller

Publication: Contribution to journal, Conference article, Peer-reviewed

14 citations (Scopus)

Abstract

In this paper we describe our approach to the MediaEval "Emotion in Music" task. Our method uses deep Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) for dynamic Arousal and Valence regression, with acoustic and psychoacoustic features extracted from the songs that have previously proven effective for emotion prediction in music. Results on the challenge test set show excellent performance for Arousal estimation (r = 0.613 ± 0.278), but not for Valence (r = 0.026 ± 0.500). Issues with the reliability and distribution of the test set annotations are put forward as plausible explanations for these results. Using a subset of the development set that was held out for performance estimation, we determined that the test results may underestimate the performance of our approach for Valence (Arousal: r = 0.596 ± 0.386; Valence: r = 0.458 ± 0.551).
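The r values in the abstract are reported as a mean with a spread. The abstract does not state how they were aggregated, so the following is only a minimal sketch under the assumption that Pearson's r is computed per song between the predicted and annotated time series and then summarized as mean ± standard deviation across songs. The function names `pearson_r` and `summarize_per_song` are hypothetical and do not come from the paper.

```python
import math
import statistics


def pearson_r(pred, gold):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(pred) != len(gold) or len(pred) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = statistics.fmean(pred)
    mean_g = statistics.fmean(gold)
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_g = sum((g - mean_g) ** 2 for g in gold)
    if var_p == 0 or var_g == 0:
        # Assumption: a constant sequence is scored as r = 0
        # (the correlation is mathematically undefined in this case).
        return 0.0
    return cov / math.sqrt(var_p * var_g)


def summarize_per_song(songs):
    """Return (mean, std) of per-song r values.

    `songs` is a list of (predictions, annotations) pairs, one per song.
    Assumption: population standard deviation; the paper may use another.
    """
    rs = [pearson_r(pred, gold) for pred, gold in songs]
    return statistics.fmean(rs), statistics.pstdev(rs)


if __name__ == "__main__":
    # Toy data only, not taken from the challenge.
    toy = [
        ([0.1, 0.2, 0.3], [0.2, 0.4, 0.6]),
        ([0.1, 0.2, 0.3], [0.3, 0.2, 0.1]),
    ]
    mean_r, std_r = summarize_per_song(toy)
    print(f"r = {mean_r:.3f} ± {std_r:.3f}")
```

Under this reading, a large spread such as ± 0.500 for Valence would mean that per-song correlations range from clearly positive to clearly negative, so the mean alone says little about any individual song.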

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 1436
Publication status: Published - 2015
Externally published: Yes
Event: Multimedia Benchmark Workshop, MediaEval 2015 - Wurzen, Germany
Duration: 14 Sept 2015 to 15 Sept 2015
