Robust in-car spelling recognition - A tandem BLSTM-HMM approach

Martin Wöllmer, Florian Eyben, Björn Schuller, Yang Sun, Tobias Moosmayr, Nhu Nguyen-Thien

Research output: Contribution to journal > Conference article > peer-review

17 Scopus citations

Abstract

As an intuitive hands-free input modality, automatic spelling recognition is especially useful for in-car human-machine interfaces. However, for today's speech recognition engines it is extremely challenging to cope with similar-sounding spelling speech sequences in the presence of noise such as the driving noise inside a car. Thus, we propose a novel Tandem spelling recogniser, combining a Hidden Markov Model (HMM) with a discriminatively trained bidirectional Long Short-Term Memory (BLSTM) recurrent neural net. The BLSTM network captures long-range temporal dependencies to learn the properties of in-car noise, which makes the Tandem BLSTM-HMM robust with respect to speech signal disturbances at extremely low signal-to-noise ratios and mismatches between training and test noise conditions. Experiments considering various driving conditions reveal that our Tandem recogniser outperforms a conventional HMM by up to 33%.

Original language: English
Pages (from-to): 2507-2510
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 2009
Event: 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009 - Brighton, United Kingdom
Duration: 6 Sep 2009 - 10 Sep 2009

Keywords

  • Long short-term memory
  • Noise robustness
  • Recurrent neural networks
  • Spelling recognition
