Frame-discriminative and confidence-driven adaptation for LVCSR

Frank Wallhoff, Daniel Willett, Gerhard Rigoll

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

31 Scopus citations

Abstract

Maximum likelihood linear regression (MLLR) has become the most popular approach for adapting speaker-independent hidden Markov models to a specific speaker's characteristics. However, it is well known that discriminative training objectives outperform maximum likelihood training approaches, especially when training data is very limited, as is always the case in adaptation tasks. This paper therefore explores the application of a frame-based discriminative training objective to adaptation. It presents evaluations for supervised as well as unsupervised adaptation on the 1993 WSJ adaptation tests of native and non-native speakers. Relative improvements in word error rate of up to 25% were measured compared to the MLLR-adapted recognition systems. Along with unsupervised adaptation, the paper also presents the improvements achieved by the application of confidence measures, which provided an average relative improvement of 10% compared to ordinary unsupervised MLLR.
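For readers unfamiliar with the MLLR baseline the abstract refers to: MLLR adapts a speaker-independent model by applying a shared affine transform to the Gaussian means, conventionally written as a single matrix acting on an extended mean vector. The sketch below illustrates only that mean-transformation step with NumPy; the transform values and dimensions are arbitrary placeholders, not numbers from the paper.

```python
import numpy as np

# Illustrative MLLR mean adaptation: each speaker-independent Gaussian
# mean mu is mapped through a shared affine transform, mu' = A @ mu + b.
# This is usually written as W @ xi, with W = [b, A] and the extended
# mean vector xi = [1, mu].
dim = 3
rng = np.random.default_rng(0)

A = np.eye(dim) + 0.1 * rng.standard_normal((dim, dim))  # linear part (near identity)
b = 0.05 * rng.standard_normal(dim)                      # bias part
W = np.hstack([b[:, None], A])                           # (dim, dim + 1) MLLR matrix

mu = rng.standard_normal(dim)         # one speaker-independent Gaussian mean
xi = np.concatenate([[1.0], mu])      # extended mean vector [1, mu]
mu_adapted = W @ xi                   # speaker-adapted mean

# The extended-vector form is exactly the affine transform:
assert np.allclose(mu_adapted, A @ mu + b)
```

In practice W is estimated from the (limited) adaptation data; the paper's contribution is to replace the maximum likelihood objective for that estimation with a frame-based discriminative one.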

Original language: English
Title of host publication: Speech Processing II
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1835-1838
Number of pages: 4
ISBN (Electronic): 0780362934
DOIs
State: Published - 2000
Externally published: Yes
Event: 25th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2000 - Istanbul, Turkey
Duration: 5 Jun 2000 - 9 Jun 2000

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 3
ISSN (Print): 1520-6149

Conference

Conference: 25th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2000
Country/Territory: Turkey
City: Istanbul
Period: 5/06/00 - 9/06/00
