Supervised and semi-supervised suppression of background music in monaural speech recordings

Felix Weninger, Jordi Feliu, Bjorn Schuller

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

29 Scopus citations

Abstract

In this paper, we propose a semi-supervised algorithm based on sparse non-negative matrix factorization (NMF) to improve separation of speech from background music in monaural signals. In our approach, fixed speech basis vectors are obtained from training data whereas music bases are estimated on the fly to cope with spectral variability while keeping the NMF dimensionality small for reduced computational effort. In a large-scale experimental evaluation with 168 speakers from the TIMIT database, we compare the semi-supervised method to supervised NMF with an explicit background music model. Our results reveal that the semi-supervised method outperforms supervised NMF at low speech-to-music ratios, and that sparsity constraints on the music spectra to enforce harmonicity can improve separation performance.
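The semi-supervised scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the cost function or the exact sparsity formulation, so the sketch assumes a Euclidean cost with standard multiplicative updates and a plain L1 penalty on the music bases. The function name `semi_supervised_nmf` and all of its parameters are hypothetical. The input `V` is taken to be a non-negative magnitude spectrogram (frequency bins by frames), and `W_speech` a matrix of speech bases learned beforehand from training data.

```python
import numpy as np

def semi_supervised_nmf(V, W_speech, n_music=4, sparsity=0.0, n_iter=200, seed=0):
    """Illustrative semi-supervised NMF (assumed Euclidean cost, not the paper's exact method).

    V        : non-negative spectrogram, shape (n_freq, n_frames)
    W_speech : fixed, pre-trained speech bases, shape (n_freq, n_speech)
    n_music  : number of music bases estimated from the mixture itself
    sparsity : weight of an L1 penalty on the music bases (assumed formulation)
    """
    eps = 1e-9
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_speech = W_speech.shape[1]

    # Music bases and all activations start from random non-negative values.
    W_music = rng.random((n_freq, n_music)) + eps
    H = rng.random((n_speech + n_music, n_frames)) + eps

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_music])
        # Update the activations of both speech and music bases.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the music bases; the speech bases stay fixed.
        H_music = H[n_speech:]
        W_music *= (V @ H_music.T) / ((W @ H) @ H_music.T + sparsity + eps)

    # Soft mask: share of the model's energy explained by the speech bases.
    W = np.hstack([W_speech, W_music])
    speech_mask = (W_speech @ H[:n_speech]) / (W @ H + eps)
    return speech_mask * V, W_music, H
```

Because only `n_music` extra bases are learned per recording, the factorization stays small, which matches the abstract's motivation of limiting computation. The L1 penalty here is deliberately simplistic; a practical version would also need to handle the scale ambiguity between bases and activations, for example by normalizing the basis columns.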

Original language: English
Title of host publication: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
Pages: 61-64
Number of pages: 4
State: Published - 2012
Event: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Kyoto, Japan
Duration: 25 Mar 2012 → 30 Mar 2012

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Country/Territory: Japan
City: Kyoto
Period: 25/03/12 → 30/03/12

Keywords

  • non-negative matrix factorization
  • sparse coding
  • speech enhancement
  • supervised source separation
