TY - GEN
T1 - Exploring nonnegative matrix factorization for audio classification
T2 - 10. ITG-Fachtagung Sprachkommunikation - 10th ITG Conference on Speech Communication
AU - Joder, Cyril
AU - Schuller, Björn
N1 - Publisher Copyright: © 2020 Sprachkommunikation - 10. ITG-Fachtagung. All rights reserved.
PY - 2020
Y1 - 2020
N2 - In this paper, we test the use of Nonnegative Matrix Factorization (NMF) for feature extraction in the context of audio classification. NMF decomposes the spectrogram into nonnegative factors and has been successfully applied to audio source separation. Thus, it has the potential to be robust to noise disturbances when used for feature calculation. We introduce two feature sets derived directly from the NMF decomposition. Experiments performed on an 8-class speaker recognition task with Support Vector Machines show that the proposed representations convey information complementary to the baseline MFCC features. Indeed, using only the NMF-based descriptors leads to results similar to those of the reference features, and combining the two representations yields a significant improvement in accuracy.
UR - http://www.scopus.com/inward/record.url?scp=85091341676&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85091341676
T3 - Sprachkommunikation - 10. ITG-Fachtagung
SP - 183
EP - 186
BT - Sprachkommunikation - 10. ITG-Fachtagung
PB - VDE VERLAG GMBH
Y2 - 26 September 2012 through 28 September 2012
ER -