TY - GEN
T1 - Speaking Corona? Human and Machine Recognition of COVID-19 from Voice
AU - Hecker, Pascal
AU - Pokorny, Florian B.
AU - Bartl-Pokorny, Katrin D.
AU - Reichel, Uwe
AU - Ren, Zhao
AU - Hantke, Simone
AU - Eyben, Florian
AU - Schuller, Dagmar M.
AU - Arnrich, Bert
AU - Schuller, Björn W.
N1 - Publisher Copyright:
Copyright ©2021 ISCA.
PY - 2021
Y1 - 2021
AB - With the COVID-19 pandemic, several research teams have reported successful advances in the automated recognition of COVID-19 from voice. The resulting voice-based screening tools for COVID-19 could support large-scale testing efforts. While the capabilities of machines on this task are progressing, we address the so-far unexplored question of whether human raters can distinguish COVID-19 positive from COVID-19 negative tested speakers based on voice samples, and compare their performance to a machine learning baseline. To account for the challenging symptom similarity between COVID-19 and other respiratory diseases, we use a carefully balanced dataset of voice samples in which COVID-19 positive and negative tested speakers are matched by their symptoms, alongside COVID-19 negative speakers without symptoms. Both the human raters and the machine struggle to reliably identify COVID-19 positive speakers in our dataset. These results indicate that particular attention should be paid to the distribution of symptoms across all speakers of a dataset when assessing the capabilities of existing systems. Identifying the acoustic aspects of COVID-19-related symptom manifestations might be the key to reliable voice-based COVID-19 detection in the future, by both trained human raters and machine learning models.
KW - Auditory disease perception
KW - Automatic disease recognition
KW - Computational paralinguistics
KW - COVID-19
KW - Voice
UR - http://www.scopus.com/inward/record.url?scp=85119302612&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2021-1771
DO - 10.21437/Interspeech.2021-1771
M3 - Conference contribution
AN - SCOPUS:85119302612
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 701
EP - 705
BT - 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
PB - International Speech Communication Association
T2 - 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Y2 - 30 August 2021 through 3 September 2021
ER -