TY - JOUR
T1 - Stream fusion for multi-stream automatic speech recognition
AU - Sagha, Hesam
AU - Li, Feipeng
AU - Variani, Ehsan
AU - Millán, José del R.
AU - Chavarriaga, Ricardo
AU - Schuller, Björn
N1 - Publisher Copyright: © 2016, Springer Science+Business Media New York.
PY - 2016/12/1
Y1 - 2016/12/1
N2 - Multi-stream automatic speech recognition (MS-ASR) has been confirmed to boost recognition performance in noisy conditions. In such a system, the generation and the fusion of the streams are the essential parts and need to be designed so as to reduce the effect of noise on the final decision. This paper shows how to improve the performance of MS-ASR by targeting two questions: (1) how many streams are to be combined, and (2) how to combine them. First, we propose a novel approach based on stream reliability to select the number of streams to be fused. Second, a fusion method based on Parallel Hidden Markov Models is introduced. Applying the method to two datasets (TIMIT and RATS) with different noise types, we show an improvement of MS-ASR.
AB - Multi-stream automatic speech recognition (MS-ASR) has been confirmed to boost recognition performance in noisy conditions. In such a system, the generation and the fusion of the streams are the essential parts and need to be designed so as to reduce the effect of noise on the final decision. This paper shows how to improve the performance of MS-ASR by targeting two questions: (1) how many streams are to be combined, and (2) how to combine them. First, we propose a novel approach based on stream reliability to select the number of streams to be fused. Second, a fusion method based on Parallel Hidden Markov Models is introduced. Applying the method to two datasets (TIMIT and RATS) with different noise types, we show an improvement of MS-ASR.
KW - Classifier ensemble creation and fusion
KW - Multi-stream speech recognition
KW - Performance monitor
UR - http://www.scopus.com/inward/record.url?scp=84982962183&partnerID=8YFLogxK
U2 - 10.1007/s10772-016-9357-1
DO - 10.1007/s10772-016-9357-1
M3 - Article
AN - SCOPUS:84982962183
SN - 1381-2416
VL - 19
SP - 669
EP - 675
JO - International Journal of Speech Technology
JF - International Journal of Speech Technology
IS - 4
ER -