TY - GEN
T1 - Language proficiency assessment of English L2 speakers based on joint analysis of prosody and native language
AU - Zhang, Yue
AU - Weninger, Felix
AU - Batliner, Anton
AU - Schuller, Björn
AU - Hönig, Florian
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/10/31
Y1 - 2016/10/31
N2 - In this work, we present an in-depth analysis of the interdependency between the non-native prosody and the native language (L1) of English L2 speakers, as separately investigated in the Degree of Nativeness Task and the Native Language Task of the INTERSPEECH 2015 and 2016 Computational Paralinguistics ChallengE (ComParE). To this end, we propose a multi-task learning scheme based on auxiliary attributes for jointly learning the tasks of L1 classification and prosody score regression. The effectiveness of this approach is demonstrated in extensive experimental runs, comparing various standardised feature sets of prosodic, cepstral, spectral, and voice quality descriptors, as well as automatic feature selection. In the result, we show that the prediction of both prosody score and L1 can be improved by considering both tasks in a holistic way. In particular, we achieve an 11 % relative gain in regression performance (Spearman's correlation coefficient) on prosody scores, when comparing the best multi- and single-task learning results.
AB - In this work, we present an in-depth analysis of the interdependency between the non-native prosody and the native language (L1) of English L2 speakers, as separately investigated in the Degree of Nativeness Task and the Native Language Task of the INTERSPEECH 2015 and 2016 Computational Paralinguistics ChallengE (ComParE). To this end, we propose a multi-task learning scheme based on auxiliary attributes for jointly learning the tasks of L1 classification and prosody score regression. The effectiveness of this approach is demonstrated in extensive experimental runs, comparing various standardised feature sets of prosodic, cepstral, spectral, and voice quality descriptors, as well as automatic feature selection. In the result, we show that the prediction of both prosody score and L1 can be improved by considering both tasks in a holistic way. In particular, we achieve an 11 % relative gain in regression performance (Spearman's correlation coefficient) on prosody scores, when comparing the best multi- and single-task learning results.
KW - Feature evaluation
KW - L1 identification
KW - Non-native prosody
UR - http://www.scopus.com/inward/record.url?scp=85016570434&partnerID=8YFLogxK
U2 - 10.1145/2993148.2993155
DO - 10.1145/2993148.2993155
M3 - Conference contribution
AN - SCOPUS:85016570434
T3 - ICMI 2016 - Proceedings of the 18th ACM International Conference on Multimodal Interaction
SP - 274
EP - 278
BT - ICMI 2016 - Proceedings of the 18th ACM International Conference on Multimodal Interaction
A2 - Pelachaud, Catherine
A2 - Nakano, Yukiko I.
A2 - Nishida, Toyoaki
A2 - Busso, Carlos
A2 - Morency, Louis-Philippe
A2 - André, Elisabeth
PB - Association for Computing Machinery, Inc
T2 - 18th ACM International Conference on Multimodal Interaction, ICMI 2016
Y2 - 12 November 2016 through 16 November 2016
ER -