TY - JOUR
T1 - Hybrid NN/HMM acoustic modeling techniques for distributed speech recognition
AU - Stadermann, Jan
AU - Rigoll, Gerhard
PY - 2006/8
Y1 - 2006/8
N2 - Distributed speech recognition (DSR), where the recognizer is split into two parts connected via a transmission channel, offers new perspectives for improving speech recognition performance in mobile environments. In this work, we present the integration of hybrid acoustic models using tied-posteriors in a distributed environment. A comparison with standard Gaussian models is performed on the AURORA2 task and the WSJ0 task. Word-based HMMs and phoneme-based HMMs are trained for distributed and non-distributed recognition using either MFCC or RASTA-PLP features. The results show that hybrid modeling techniques can outperform standard continuous systems on this task. In particular, the tied-posteriors approach is shown to be usable for DSR in a very flexible way, since the client can be modified without a change at the server site and vice versa.
AB - Distributed speech recognition (DSR), where the recognizer is split into two parts connected via a transmission channel, offers new perspectives for improving speech recognition performance in mobile environments. In this work, we present the integration of hybrid acoustic models using tied-posteriors in a distributed environment. A comparison with standard Gaussian models is performed on the AURORA2 task and the WSJ0 task. Word-based HMMs and phoneme-based HMMs are trained for distributed and non-distributed recognition using either MFCC or RASTA-PLP features. The results show that hybrid modeling techniques can outperform standard continuous systems on this task. In particular, the tied-posteriors approach is shown to be usable for DSR in a very flexible way, since the client can be modified without a change at the server site and vice versa.
KW - Distributed speech recognition
KW - Hybrid speech recognition
KW - Tied-posteriors
UR - http://www.scopus.com/inward/record.url?scp=33745351889&partnerID=8YFLogxK
U2 - 10.1016/j.specom.2006.01.007
DO - 10.1016/j.specom.2006.01.007
M3 - Article
AN - SCOPUS:33745351889
SN - 0167-6393
VL - 48
SP - 1037
EP - 1046
JO - Speech Communication
JF - Speech Communication
IS - 8
ER -