Exploring nonnegative matrix factorization for audio classification: Application to speaker recognition

Cyril Joder, Björn Schuller

Publication: Contribution to book/report/conference proceedings › Conference contribution › Peer-reviewed

Abstract

In this paper, we test the use of Nonnegative Matrix Factorization (NMF) for feature extraction in the context of audio classification. NMF decomposes the spectrogram into nonnegative factors and has been successfully applied to audio source separation. Thus, it has the potential to be robust to noise disturbances when used for feature calculation. We then introduce two feature sets directly derived from the NMF decomposition. Experiments performed on an 8-class speaker recognition task with Support Vector Machines show that the proposed representations convey information complementary to the baseline MFCC features. Indeed, using only the NMF-based descriptors leads to results similar to those of the reference features, and combining the two representations yields a significant improvement in accuracy.
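The decomposition mentioned in the abstract can be illustrated with a minimal sketch. It assumes a Euclidean cost with the standard multiplicative update rules and a synthetic stand-in for a magnitude spectrogram. The abstract does not state which cost function, number of components, or feature definitions the authors used, so the function name `nmf` and all parameter values here are hypothetical.

```python
import numpy as np

def nmf(V, n_components, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (freq x frames) as W @ H.

    Uses multiplicative updates for the Euclidean cost; this is an
    assumption, not necessarily the variant used in the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random nonnegative initialization of spectral bases W and activations H.
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep both factors nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for a magnitude spectrogram: 64 bins, 100 frames, rank 3.
rng = np.random.default_rng(1)
V = rng.random((64, 3)) @ rng.random((3, 100))

W, H = nmf(V, n_components=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In this sketch the columns of `W` play the role of spectral bases and the rows of `H` their activations over time. Features for a classifier could be derived from either factor, but how the paper does this is not described in the abstract.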

Original language: English
Title: Sprachkommunikation - 10. ITG-Fachtagung
Publisher: VDE VERLAG GMBH
Pages: 183-186
Number of pages: 4
ISBN (electronic): 9783800734559
Publication status: Published - 2020
Event: 10. ITG-Fachtagung Sprachkommunikation - 10th ITG Conference on Speech Communication - Braunschweig, Germany
Duration: 26 Sept. 2012 to 28 Sept. 2012

Publication series

Name: Sprachkommunikation - 10. ITG-Fachtagung

Conference

Conference: 10. ITG-Fachtagung Sprachkommunikation - 10th ITG Conference on Speech Communication
Country/Territory: Germany
City: Braunschweig
Period: 26/09/12 to 28/09/12
