Abstract
In this paper we present a context-dependent hybrid MMI-connectionist/Hidden Markov Model (HMM) speech recognition system for the Wall Street Journal (WSJ) database. The hybrid system is built from a neural network, which is used as a vector quantizer (VQ), and an HMM with discrete probability density functions, which has the advantage of faster decoding. The neural network is trained with an algorithm that tries to maximize the mutual information between the classes of the input features (e.g. phones, triphones, etc.) and the neural firing sequence of the network. The system has been trained on the 1992 WSJ corpus (si-84). Tests were performed on the five- and twenty-thousand-word, speaker-independent (si_et) tasks. The error rates of a new context-dependent neural network are 29% lower (relative) than the error rates of a standard (k-means) discrete system, and they are very close to those of the best continuous/semi-continuous HMM speech recognizers.
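The training criterion named in the abstract, mutual information between feature classes and the network's firing sequence, can be illustrated with a small empirical estimate. The sketch below is not the authors' training algorithm; it only computes the quantity such training would try to increase, from paired per-frame sequences. The function name `mutual_information` and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(classes, firings):
    """Empirical mutual information I(C; Y) in bits.

    classes: per-frame class labels (e.g. phone labels).
    firings: per-frame index of the winning (firing) neuron, i.e. the VQ label.
    """
    if len(classes) != len(firings):
        raise ValueError("sequences must have equal length")
    n = len(classes)
    joint = Counter(zip(classes, firings))  # co-occurrence counts N(c, y)
    count_c = Counter(classes)              # marginal counts N(c)
    count_y = Counter(firings)              # marginal counts N(y)
    mi = 0.0
    for (c, y), count in joint.items():
        # p(c, y) * log2( p(c, y) / (p(c) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (count_c[c] * count_y[y]))
    return mi


# Hypothetical toy data: four frames with two phone classes.
phones = ["a", "a", "b", "b"]
informative = [0, 0, 1, 1]    # firing neuron identifies the class exactly
uninformative = [0, 1, 0, 1]  # firing neuron is independent of the class

print(mutual_information(phones, informative))
print(mutual_information(phones, uninformative))
```

A quantizer whose firing pattern separates the classes yields a high value, while one whose firing pattern is independent of the classes yields zero, which is the intuition behind preferring an MMI-trained VQ over a purely distortion-based one such as k-means.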
Original language | English |
---|---|
Pages | 79-82 |
Number of pages | 4 |
State | Published - 1997 |
Externally published | Yes |
Event | 5th European Conference on Speech Communication and Technology, EUROSPEECH 1997 - Rhodes, Greece; Duration: 22 Sep 1997 → 25 Sep 1997 |
Conference
Conference | 5th European Conference on Speech Communication and Technology, EUROSPEECH 1997 |
---|---|
Country/Territory | Greece |
City | Rhodes |
Period | 22/09/97 → 25/09/97 |