TY - JOUR
T1 - Multi-Type Outer Product-Based Fusion of Respiratory Sounds for Detecting COVID-19
AU - Mallol-Ragolta, Adria
AU - Cuesta, Helena
AU - Gómez, Emilia
AU - Schuller, Björn W.
N1 - Publisher Copyright:
Copyright © 2022 ISCA.
PY - 2022
Y1 - 2022
N2 - This work presents an outer product-based approach to fuse the embedded representations learnt from the spectrograms of cough, breath, and speech samples for the automatic detection of COVID-19. To extract deep learnt representations from the spectrograms, we compare the performance of specific Convolutional Neural Networks (CNNs) trained from scratch and ResNet18-based CNNs fine-tuned for the task at hand. Furthermore, we investigate whether the patients' sex and the use of contextual attention mechanisms are beneficial. Our experiments use the dataset released as part of the Second Diagnosing COVID-19 using Acoustics (DiCOVA) Challenge. The results suggest the suitability of fusing breath and speech information to detect COVID-19. An Area Under the Curve (AUC) of 84.06 % is obtained on the test partition when using specific CNNs trained from scratch with contextual attention mechanisms. When using ResNet18-based CNNs for feature extraction, the baseline model scores the highest performance with an AUC of 84.26 %.
AB - This work presents an outer product-based approach to fuse the embedded representations learnt from the spectrograms of cough, breath, and speech samples for the automatic detection of COVID-19. To extract deep learnt representations from the spectrograms, we compare the performance of specific Convolutional Neural Networks (CNNs) trained from scratch and ResNet18-based CNNs fine-tuned for the task at hand. Furthermore, we investigate whether the patients' sex and the use of contextual attention mechanisms are beneficial. Our experiments use the dataset released as part of the Second Diagnosing COVID-19 using Acoustics (DiCOVA) Challenge. The results suggest the suitability of fusing breath and speech information to detect COVID-19. An Area Under the Curve (AUC) of 84.06 % is obtained on the test partition when using specific CNNs trained from scratch with contextual attention mechanisms. When using ResNet18-based CNNs for feature extraction, the baseline model scores the highest performance with an AUC of 84.26 %.
KW - COVID-19 Detection
KW - Healthcare
KW - Information Fusion
KW - Respiratory Diagnosis
KW - Transfer Learning
UR - http://www.scopus.com/inward/record.url?scp=85140074468&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2022-10291
DO - 10.21437/Interspeech.2022-10291
M3 - Conference article
AN - SCOPUS:85140074468
SN - 2308-457X
VL - 2022-September
SP - 2163
EP - 2167
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022
Y2 - 18 September 2022 through 22 September 2022
ER -