Speech overlap detection using convolutive non-negative sparse coding: New improvements and insights

Jurgen T. Geiger, Ravichander Vipperla, Nicholas Evans, Björn Schuller, Gerhard Rigoll

Publication: Contribution to book/report/conference proceedings › Conference contribution › Peer-reviewed

8 citations (Scopus)

Abstract

This paper presents recent advances in the application of convolutive non-negative sparse coding (CNSC) to the problem of overlap detection in the context of conference meetings and speaker diarization. CNSC is used to project a mixed speaker signal onto separate speaker bases and hence to detect intervals of competing speech. We present new energy ratio and total energy features which give significant improvements over our previous work. The system is assessed using a subset of the AMI meeting corpus. We report results that are comparable to the state of the art and that support the potential of a new approach to overlap detection. An analysis of system performance highlights the importance of further work to address weaknesses in detecting particularly short segments of overlapping speech.
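The idea of projecting a mixed signal onto per-speaker bases and comparing the resulting energies can be illustrated with a minimal sketch. It is not the paper's implementation: it uses plain (non-convolutive) non-negative sparse coding with fixed bases, and the function names, the sparsity weight, and the exact definitions of the ratio and total energy features are assumptions made for illustration only.

```python
import numpy as np

def sparse_activations(V, W, sparsity=0.1, n_iter=200, eps=1e-9):
    """Estimate non-negative activations H such that V ~= W @ H, with W fixed.

    Uses the multiplicative update for a Euclidean cost with an L1 penalty
    on H. This is a non-convolutive simplification of CNSC.
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + sparsity + eps)
    return H

def energy_features(V, bases, **kwargs):
    """Return per-frame (energy_ratio, total_energy) for two or more speakers.

    bases: list of per-speaker basis matrices, each (n_freq, n_atoms).
    The ratio is assumed here to be second-strongest over strongest
    speaker energy; values near 1 suggest competing speech.
    """
    W = np.hstack(bases)
    H = sparse_activations(V, W, **kwargs)
    energies, start = [], 0
    for B in bases:
        k = B.shape[1]
        recon = B @ H[start:start + k]          # this speaker's share of V
        energies.append((recon ** 2).sum(axis=0))
        start += k
    E = np.sort(np.vstack(energies), axis=0)[::-1]  # descending per frame
    total = E.sum(axis=0)
    ratio = E[1] / (E[0] + 1e-9)
    return ratio, total

# Toy magnitude spectrogram: frame 0 has speaker A only, frame 1 has both.
basis_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
basis_b = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
V = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 1.0], [0.0, 1.0]])
ratio, total = energy_features(V, [basis_a, basis_b])
```

In this toy case the ratio is close to zero for the single-speaker frame and close to one for the overlapped frame, so a threshold on the ratio (possibly combined with the total energy) would flag the second frame as overlap.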

Original language: English
Title: Proceedings of the 20th European Signal Processing Conference, EUSIPCO 2012
Pages: 340-344
Number of pages: 5
Publication status: Published - 2012
Event: 20th European Signal Processing Conference, EUSIPCO 2012 - Bucharest, Romania
Duration: 27 Aug 2012 to 31 Aug 2012

Publication series

Name: European Signal Processing Conference
ISSN (Print): 2219-5491

Conference

Conference: 20th European Signal Processing Conference, EUSIPCO 2012
Country/Territory: Romania
City: Bucharest
Period: 27/08/12 to 31/08/12
