Real-Time Activity Detection in a Multi-Talker Reverberated Environment

Emanuele Principi, Rudy Rotili, Martin Wöllmer, Florian Eyben, Stefano Squartini, Björn Schuller

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

This paper proposes a real-time person activity detection framework that operates in the presence of multiple sources in reverberated environments. The framework is composed of two main parts: the speech enhancement front-end and the activity detector. The front-end aims to automatically reduce the distortions that room reverberation introduces into the available distant speech signals, and thus to significantly improve speech quality for each speaker. It consists of three cooperating blocks, each fulfilling a specific task: speaker diarization, room impulse response identification, and speech dereverberation. In particular, the speaker diarization algorithm is essential for steering the operations of the other two stages according to the speakers' activity in the room. The activity estimation algorithm is based on bidirectional Long Short-Term Memory networks, which allow context-sensitive activity classification from audio feature functionals extracted with the real-time speech feature extraction toolkit openSMILE. Extensive computer simulations were performed on a subset of the AMI database for activity evaluation in meetings; the results confirm the effectiveness of the approach.
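The term "feature functionals" in the abstract refers to statistics computed over a window of frame-level audio descriptors, so that each window is summarized by a fixed-length vector. The sketch below illustrates that idea in plain Python. It does not call openSMILE, and the chosen statistics, the window length, the hop size, and the sample energy values are assumptions made for illustration, not the configuration used in the paper.

```python
from statistics import fmean, pstdev


def functionals(frames):
    """Summarize one window of frame-level values with a few statistics.

    The set of statistics here is an illustrative assumption.
    """
    lowest, highest = min(frames), max(frames)
    return {
        "mean": fmean(frames),
        "std": pstdev(frames),  # population standard deviation
        "min": lowest,
        "max": highest,
        "range": highest - lowest,
    }


def windowed_functionals(contour, win, hop):
    """Slide a window of `win` frames over `contour`, advancing by `hop` frames.

    Only full windows are used; trailing frames that do not fill a window
    are ignored.
    """
    return [
        functionals(contour[start:start + win])
        for start in range(0, len(contour) - win + 1, hop)
    ]


# Hypothetical per-frame energy values, for illustration only.
energy = [0.1, 0.2, 0.4, 0.8, 0.6, 0.3, 0.2, 0.1]
vectors = windowed_functionals(energy, win=4, hop=2)
```

Each entry of `vectors` describes one window, and the resulting sequence is the kind of input a sequence classifier such as a bidirectional LSTM could consume, with earlier and later windows providing context.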

Original language: English
Pages (from-to): 386-397
Number of pages: 12
Journal: Cognitive Computation
Volume: 4
Issue number: 4
DOIs
State: Published - Dec 2012

Keywords

  • Activity detection
  • Blind channel identification
  • Real-time signal processing
  • Speaker diarization
  • Speech dereverberation
  • Speech enhancement
