Discrimination of speech and monophonic singing in continuous audio streams applying multi-layer support vector machines

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

In this paper we present a novel approach to the discrimination of speech and monophonic singing for use in Music Information Retrieval applications. A working prototype is introduced that applies Multi-Layer Support Vector Machines to static high-level features derived from the pitch and energy contours of an acoustic signal. The feature set for the discrimination is presented and ranked by Linear Discriminant Analysis. For automatic segmentation of an input signal stream, a further feature set is used to discriminate signal from noise. A corpus for training and evaluation, comprising speech and monophonic singing data from nine performers, is described in detail. The data were labeled according to the judgments of a separate group of test subjects. A recognition rate of 99.2 % correct assignments was reached, demonstrating the high performance of the proposed methods.

Original language: English
Title of host publication: 2004 IEEE International Conference on Multimedia and Expo (ICME)
Pages: 1655-1658
Number of pages: 4
State: Published - 2004
Event: 2004 IEEE International Conference on Multimedia and Expo (ICME) - Taipei, Taiwan, Province of China
Duration: 27 Jun 2004 to 30 Jun 2004

Publication series

Name: 2004 IEEE International Conference on Multimedia and Expo (ICME)
Volume: 3

Conference

Conference: 2004 IEEE International Conference on Multimedia and Expo (ICME)
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 27/06/04 to 30/06/04
