TY - GEN
T1 - Score-informed leading voice separation from monaural audio
AU - Joder, Cyril
AU - Schuller, Björn
PY - 2012
Y1 - 2012
N2 - Separating the leading voice from a musical recording seems to be natural to the human ear. Yet, it remains a difficult problem for automatic systems, in particular in the blind case, where no information is known about the signal. However, in the case where a musical score is available, one can take advantage of this additional information. In this paper, we present a novel application of this idea for leading voice separation exploiting a temporally aligned MIDI score. The model used is based on Nonnegative Matrix Factorization (NMF), whose solo part is represented by a source-filter model. We exploit the score information by constraining the source activations to conform to the aligned MIDI file. Experiments run on a database of real popular songs show that the use of these constraints can significantly improve the separation quality, in terms of both signal-based and perceptual evaluation metrics.
UR - http://www.scopus.com/inward/record.url?scp=84873444645&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84873444645
SN - 9789727521449
T3 - Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR 2012
SP - 277
EP - 282
BT - Proceedings of the 13th International Society for Music Information Retrieval Conference, ISMIR 2012
T2 - 13th International Society for Music Information Retrieval Conference, ISMIR 2012
Y2 - 8 October 2012 through 12 October 2012
ER -