Convolutive non-negative sparse coding and new features for speech overlap handling in speaker diarization

Jürgen T. Geiger, Ravichander Vipperla, Simon Bozonnet, Nicholas Evans, Björn Schuller, Gerhard Rigoll

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

14 Scopus citations

Abstract

The effective handling of overlapping speech is at the limits of the current state of the art in speaker diarization. This paper presents our latest work in overlap detection. We report the combination of features derived through convolutive nonnegative sparse coding and new energy, spectral and voicing-related features within a conventional HMM system. Overlap detection results are fully integrated into our top-down diarization system through the application of overlap exclusion and overlap labeling. Experiments on a subset of the AMI corpus show that the new system delivers significant reductions in missed speech and speaker error. Through overlap exclusion and labeling, the overall diarization error rate is shown to improve by 6.4% relative.

Original language: English
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 2151-2154
Number of pages: 4
State: Published - 2012
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: 9 Sep 2012 to 13 Sep 2012

Publication series

Name: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Volume: 3

Conference

Conference: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country/Territory: United States
City: Portland, OR
Period: 9/09/12 to 13/09/12

Keywords

  • Convolutive nonnegative sparse coding
  • Speaker diarization
  • Speech overlap detection
