Probabilistic speech feature extraction with context-sensitive Bottleneck neural networks

Martin Wöllmer, Björn Schuller

Research output: Contribution to journal › Article › peer-review


Abstract

We introduce a novel context-sensitive feature extraction approach for spontaneous speech recognition. As bidirectional Long Short-Term Memory (BLSTM) networks are known to enable improved phoneme recognition accuracies by incorporating long-range contextual information into speech decoding, we integrate the BLSTM principle into a Tandem front-end for probabilistic feature extraction. Unlike previously proposed approaches, which exploit BLSTM modeling by generating a discrete phoneme prediction feature, our feature extractor merges continuous high-level probabilistic BLSTM features with low-level features. By combining BLSTM modeling and Bottleneck (BN) feature generation, we propose a novel front-end that allows us to produce context-sensitive probabilistic feature vectors of arbitrary size, independent of the network training targets. Evaluations on challenging spontaneous, conversational speech recognition tasks show that this concept prevails over recently published architectures for feature-level context modeling.
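The pipeline the abstract describes can be sketched in miniature: run an LSTM over the frame sequence in both directions, concatenate the two hidden-state streams, project them through a narrow bottleneck layer, and append the result to the low-level frames to form the Tandem feature vector. The sketch below uses NumPy with random, untrained weights and illustrative dimensions (13 MFCC-like features, 16 hidden units, an 8-unit bottleneck); these sizes, names, and the single-layer setup are assumptions for illustration, not the paper's actual trained configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gate pre-activations stacked as [input, forget, output, cell]."""
    z = W @ x + U @ h + b
    n = h.size
    i = 1.0 / (1.0 + np.exp(-z[:n]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))     # forget gate
    o = 1.0 / (1.0 + np.exp(-z[2*n:3*n]))   # output gate
    g = np.tanh(z[3*n:])                    # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def run_lstm(frames, W, U, b, n_hidden):
    """Unroll the LSTM over a (T, n_feat) frame sequence; return (T, n_hidden)."""
    h, c = np.zeros(n_hidden), np.zeros(n_hidden)
    out = []
    for x in frames:
        h, c = lstm_step(x, h, c, W, U, b)
        out.append(h)
    return np.stack(out)

n_feat, n_hidden, n_bn, T = 13, 16, 8, 50   # illustrative sizes
frames = rng.standard_normal((T, n_feat))   # stand-in for low-level MFCC frames

def init(n_in, n_out):
    """Random (untrained) LSTM parameters for one direction."""
    return (rng.standard_normal((4 * n_out, n_in)) * 0.1,
            rng.standard_normal((4 * n_out, n_out)) * 0.1,
            np.zeros(4 * n_out))

Wf, Uf, bf = init(n_feat, n_hidden)         # forward direction
Wb, Ub, bb = init(n_feat, n_hidden)         # backward direction

fwd = run_lstm(frames, Wf, Uf, bf, n_hidden)              # past context
bwd = run_lstm(frames[::-1], Wb, Ub, bb, n_hidden)[::-1]  # future context
blstm = np.concatenate([fwd, bwd], axis=1)                # (T, 2*n_hidden)

# Narrow "bottleneck" projection: compact context-sensitive features
# whose size is chosen freely, independent of any training targets.
W_bn = rng.standard_normal((n_bn, 2 * n_hidden)) * 0.1
bn = np.tanh(blstm @ W_bn.T)                              # (T, n_bn)

# Tandem front-end: merge BN features with the low-level frames.
tandem = np.concatenate([frames, bn], axis=1)             # (T, n_feat + n_bn)
print(tandem.shape)  # (50, 21)
```

In the actual system the networks would of course be trained (e.g. on phoneme targets) before the bottleneck activations are used; the point of the sketch is only the data flow, in particular that the final feature dimensionality (`n_feat + n_bn`) is decoupled from the number of training targets.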

Original language: English
Pages (from-to): 113-120
Number of pages: 8
Journal: Neurocomputing
Volume: 132
State: Published - 20 May 2014

Keywords

  • Bidirectional speech processing
  • Bottleneck networks
  • Long Short-Term Memory
  • Probabilistic feature extraction

