Probabilistic ASR feature extraction applying context-sensitive connectionist temporal classification networks

Publication: Contribution to book/report/conference proceedings › Conference contribution › Peer-reviewed

1 Citation (Scopus)

Abstract

This paper proposes a novel automatic speech recognition (ASR) front-end that unites the principles of bidirectional Long Short-Term Memory (BLSTM), Connectionist Temporal Classification (CTC), and Bottleneck (BN) feature generation. BLSTM networks are known to produce better probabilistic ASR features than conventional multilayer perceptrons since they are able to exploit a self-learned amount of temporal context for phoneme estimation. Combining BLSTM networks with a CTC output layer has the advantage that the network can be trained on unsegmented data, so the quality of phoneme prediction does not rely on potentially error-prone forced-alignment segmentations of the training set. In challenging ASR scenarios involving highly spontaneous, disfluent, and noisy speech, our BN-CTC front-end leads to remarkable word accuracy improvements and prevails over a series of previously introduced BLSTM-based ASR systems.
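The alignment-free property mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the blank index, and the toy probabilities are assumptions chosen for illustration. CTC scores a label sequence by summing over every frame-level path that collapses to it (repeats merged, blanks dropped), so no frame-to-phoneme segmentation is needed. The brute-force enumeration below is exponential in the number of frames; practical CTC training uses a dynamic-programming forward-backward recursion instead.

```python
from itertools import product

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical convention)


def collapse(path):
    """Map a frame-level path to a label sequence: merge repeats, then drop blanks."""
    labels = []
    previous = None
    for symbol in path:
        if symbol != previous and symbol != BLANK:
            labels.append(symbol)
        previous = symbol
    return tuple(labels)


def ctc_sequence_probability(frame_probs, target):
    """Sum the probabilities of all frame-level paths that collapse to `target`.

    frame_probs[t][k] is the network's probability of symbol k at frame t.
    Brute force over all paths; for illustration on tiny inputs only.
    """
    n_symbols = len(frame_probs[0])
    target = tuple(target)
    total = 0.0
    for path in product(range(n_symbols), repeat=len(frame_probs)):
        if collapse(path) == target:
            p = 1.0
            for t, symbol in enumerate(path):
                p *= frame_probs[t][symbol]
            total += p
    return total
```

For example, with two frames and a uniform distribution over {blank, label 1}, three of the four possible paths collapse to the single-label sequence `(1,)`, which is why the target can be scored without knowing in which frame the label occurs.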

Original language: English
Title of host publication: 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
Pages: 7125-7129
Number of pages: 5
Publication status: Published - 18 Oct 2013
Event: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Vancouver, BC, Canada
Duration: 26 May 2013 to 31 May 2013

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Country/Territory: Canada
City: Vancouver, BC
Period: 26/05/13 to 31/05/13
