Two-stage speaker adaptation of hybrid tied-posterior acoustic models

Jan Stadermann, Gerhard Rigoll

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

33 Scopus citations

Abstract

For acoustic models based on Gaussian distributions, there exist many established technologies for speaker adaptation. In contrast, there are almost no well-functioning adaptation methods for hybrid systems, which combine HMMs and neural networks. In this paper, strategies are explored to adapt hybrid NN/HMM systems based on the tied-posterior paradigm. We investigate the retraining of selected important parts of the neural network and a gradient-based adaptation strategy for the HMM's mixture coefficients based on maximizing the scaled likelihood. The paper presents the following innovations: First, it introduces one of the first adaptation methods for hybrid systems where the HMM component contributes significantly to the adaptation success. Second, it presents a novel approach to the neural network's adaptation, based on the selection of suitable neurons for adaptation. Results on the WSJ speaker adaptation test show the capability of our methods to adapt to new speakers, especially when adapting the neural net, and show that both methods can be combined to achieve additional improvement of the word error rate in most cases.

Original language: English
Title of host publication: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Image and Multidimensional Signal Processing Multimedia Signal Processing
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 977-980
Number of pages: 4
ISBN (Print): 0780388747, 9780780388741
State: Published - 2005
Event: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Philadelphia, PA, United States
Duration: 18 Mar 2005 to 23 Mar 2005

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: I
ISSN (Print): 1520-6149

Conference

Conference: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05
Country/Territory: United States
City: Philadelphia, PA
Period: 18/03/05 to 23/03/05
