Speech overlap detection using convolutive non-negative sparse coding: New improvements and insights

Jurgen T. Geiger, Ravichander Vipperla, Nicholas Evans, Björn Schuller, Gerhard Rigoll

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

This paper presents recent advances in the application of convolutive non-negative sparse coding (CNSC) to the problem of overlap detection in the context of conference meetings and speaker diarization. CNSC is used to project a mixed-speaker signal onto separate speaker bases and hence to detect intervals of competing speech. We present new energy ratio and total energy features that give significant improvements over our previous work. The system is assessed using a subset of the AMI meeting corpus. We report results that are comparable to the state of the art and that support the potential of this new approach to overlap detection. An analysis of system performance highlights the importance of further work to address weaknesses in detecting particularly short segments of overlapping speech.
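The projection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it substitutes plain (non-convolutive) sparse NMF with fixed bases for CNSC, and the function names, the sparsity weight, and the exact definitions of the "energy ratio" and "total energy" features are assumptions made only for illustration.

```python
import numpy as np

def sparse_activations(V, W, sparsity=0.1, n_iter=200, eps=1e-9, seed=0):
    """Estimate non-negative activations H such that V ~= W @ H, with W fixed.

    Multiplicative updates for the Euclidean cost with an L1 penalty on H.
    Simplified stand-in for CNSC: no convolution over time.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + sparsity + eps)
    return H

def overlap_features(V, W_a, W_b, **kwargs):
    """Per-frame energy ratio and total energy (assumed definitions).

    V: magnitude spectrogram (frequency x frames).
    W_a, W_b: hypothetical pre-trained bases for two speakers.
    """
    W = np.hstack([W_a, W_b])
    H = sparse_activations(V, W, **kwargs)
    k = W_a.shape[1]
    # Energy of each speaker's reconstructed contribution in every frame.
    e_a = np.sum((W_a @ H[:k]) ** 2, axis=0)
    e_b = np.sum((W_b @ H[k:]) ** 2, axis=0)
    total = e_a + e_b
    # Near 1 when both speakers are active, near 0 when one dominates.
    ratio = np.minimum(e_a, e_b) / (np.maximum(e_a, e_b) + 1e-12)
    return ratio, total

# Toy data: speaker A occupies bins 0-1, speaker B occupies bins 2-3.
W_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
W_b = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
# Frames: A alone, B alone, A and B together.
V = np.array([[1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
ratio, total = overlap_features(V, W_a, W_b)
```

In this toy case the third frame, where both speakers are active, gets a high energy ratio and a larger total energy than the single-speaker frames. A real detector would feed such features to a classifier rather than threshold them directly.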

Original language: English
Title of host publication: Proceedings of the 20th European Signal Processing Conference, EUSIPCO 2012
Pages: 340-344
Number of pages: 5
State: Published - 2012
Event: 20th European Signal Processing Conference, EUSIPCO 2012 - Bucharest, Romania
Duration: 27 Aug 2012 - 31 Aug 2012

Publication series

Name: European Signal Processing Conference
ISSN (Print): 2219-5491

Conference

Conference: 20th European Signal Processing Conference, EUSIPCO 2012
Country/Territory: Romania
City: Bucharest
Period: 27/08/12 - 31/08/12

Keywords

  • convolutive non-negative sparse coding
  • speaker diarization
  • speech overlap detection
