Tandem decoding of children's speech for keyword detection in a child-robot interaction scenario

Martin Wöllmer, Björn Schuller, Anton Batliner, Stefan Steidl, Dino Seppi

Publication: Contribution to journal › Article › Peer-reviewed

13 citations (Scopus)

Abstract

In this article, we focus on keyword detection in children's speech, as needed in voice command systems. We use the FAU Aibo Emotion Corpus, which contains emotionally colored spontaneous children's speech recorded in a child-robot interaction scenario, and investigate various recent keyword spotting techniques. As the principle of bidirectional Long Short-Term Memory (BLSTM) is known to be well-suited for context-sensitive phoneme prediction, we incorporate a BLSTM network into a Tandem model for flexible coarticulation modeling in children's speech. Our experiments reveal that the Tandem model prevails over a triphone-based Hidden Markov Model approach.

Original language: English
Article number: 12
Journal: ACM Transactions on Speech and Language Processing
Volume: 7
Issue number: 4
Publication status: Published - Aug. 2011
