TY - JOUR
T1 - Cross-domain classification of drowsiness in speech
T2 - 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017
AU - Zhang, Yue
AU - Weninger, Felix
AU - Schuller, Björn W.
N1 - Publisher Copyright: Copyright © 2017 ISCA.
PY - 2017
Y1 - 2017
N2 - In this work, we study the drowsy state of a speaker, induced by alcohol intoxication or sleep deprivation. In particular, we investigate the coherence between the two pivotal causes of drowsiness, as featured in the Intoxication and Sleepiness tasks of the INTERSPEECH Speaker State Challenge. In this way, we aim to exploit the interrelations between these different, yet highly correlated speaker states, which need to be reliably recognised in safety- and security-critical environments. To this end, we perform cross-domain classification of alcohol intoxication and sleepiness, thus leveraging the acoustic similarities of these speech phenomena for transfer learning. Further, we conduct an in-depth feature analysis to quantitatively assess the task relatedness and to determine the most relevant features for both tasks. To test our methods in realistic contexts, we use the Alcohol Language Corpus and the Sleepy Language Corpus, which together contain 60 hours of genuine intoxicated and sleepy speech. As a result, cross-domain classification combined with feature selection yields up to 60.3% unweighted average recall, which is significantly above chance level (50%) and highly notable given the mismatch between the training and validation data. Finally, we show that an effective, general drowsiness classifier can be obtained by aggregating the training data from both domains.
AB - In this work, we study the drowsy state of a speaker, induced by alcohol intoxication or sleep deprivation. In particular, we investigate the coherence between the two pivotal causes of drowsiness, as featured in the Intoxication and Sleepiness tasks of the INTERSPEECH Speaker State Challenge. In this way, we aim to exploit the interrelations between these different, yet highly correlated speaker states, which need to be reliably recognised in safety- and security-critical environments. To this end, we perform cross-domain classification of alcohol intoxication and sleepiness, thus leveraging the acoustic similarities of these speech phenomena for transfer learning. Further, we conduct an in-depth feature analysis to quantitatively assess the task relatedness and to determine the most relevant features for both tasks. To test our methods in realistic contexts, we use the Alcohol Language Corpus and the Sleepy Language Corpus, which together contain 60 hours of genuine intoxicated and sleepy speech. As a result, cross-domain classification combined with feature selection yields up to 60.3% unweighted average recall, which is significantly above chance level (50%) and highly notable given the mismatch between the training and validation data. Finally, we show that an effective, general drowsiness classifier can be obtained by aggregating the training data from both domains.
KW - Computational Paralinguistics
KW - Drowsiness detection
KW - Feature analysis
KW - Speaker states
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85029517474&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2017-1015
DO - 10.21437/Interspeech.2017-1015
M3 - Conference article
AN - SCOPUS:85029517474
SN - 2308-457X
VL - 2017-August
SP - 3152
EP - 3156
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Y2 - 20 August 2017 through 24 August 2017
ER -