TY - GEN
T1 - The DiCOVA 2021 challenge - An encoder-decoder approach for COVID-19 recognition from coughing audio
AU - Deshpande, Gauri
AU - Schuller, Björn W.
N1 - Publisher Copyright: Copyright © 2021 ISCA.
PY - 2021
Y1 - 2021
N2 - This paper presents the automatic recognition of COVID-19 from coughing. In particular, it describes our contribution to the DiCOVA challenge - Track 1, which addresses such cough sound analysis for COVID-19 detection. Pathologically, the effects of a COVID-19 infection on the respiratory system and on breathing patterns are known. We demonstrate the use of breathing patterns of the cough audio signal in identifying the COVID-19 status. Breathing patterns of the cough audio signal are derived using a model trained with the subset of the UCL Speech Breath Monitoring (UCL-SBM) database. This database provides speech recordings of the participants while their breathing values are captured by a respiratory belt. We use an encoder-decoder architecture. The encoder encodes the audio signal into breathing patterns and the decoder decodes the COVID-19 status for the corresponding breathing patterns using an attention mechanism. The encoder uses a pre-trained model which predicts breathing patterns from the speech signal, and transfers the learned patterns to cough audio signals. With this architecture, we achieve an AUC of 64.42% on the evaluation set of Track 1.
AB - This paper presents the automatic recognition of COVID-19 from coughing. In particular, it describes our contribution to the DiCOVA challenge - Track 1, which addresses such cough sound analysis for COVID-19 detection. Pathologically, the effects of a COVID-19 infection on the respiratory system and on breathing patterns are known. We demonstrate the use of breathing patterns of the cough audio signal in identifying the COVID-19 status. Breathing patterns of the cough audio signal are derived using a model trained with the subset of the UCL Speech Breath Monitoring (UCL-SBM) database. This database provides speech recordings of the participants while their breathing values are captured by a respiratory belt. We use an encoder-decoder architecture. The encoder encodes the audio signal into breathing patterns and the decoder decodes the COVID-19 status for the corresponding breathing patterns using an attention mechanism. The encoder uses a pre-trained model which predicts breathing patterns from the speech signal, and transfers the learned patterns to cough audio signals. With this architecture, we achieve an AUC of 64.42% on the evaluation set of Track 1.
KW - Acoustics
KW - COVID-19
KW - Healthcare
KW - Machine learning
KW - Respiratory diagnosis
UR - http://www.scopus.com/inward/record.url?scp=85119286616&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2021-811
DO - 10.21437/Interspeech.2021-811
M3 - Conference contribution
AN - SCOPUS:85119286616
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 4251
EP - 4255
BT - 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
PB - International Speech Communication Association
T2 - 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Y2 - 30 August 2021 through 3 September 2021
ER -