TY - GEN
T1 - A comparison of acoustic and linguistics methodologies for Alzheimer's dementia recognition
AU - Cummins, Nicholas
AU - Pan, Yilin
AU - Ren, Zhao
AU - Fritsch, Julian
AU - Nallanthighal, Venkata Srikanth
AU - Christensen, Heidi
AU - Blackburn, Daniel
AU - Schuller, Björn W.
AU - Magimai-Doss, Mathew
AU - Strik, Helmer
AU - Härmä, Aki
N1 - Publisher Copyright: Copyright © 2020 ISCA
PY - 2020
Y1 - 2020
AB - In the light of the current COVID-19 pandemic, the need for remote digital health assessment tools is greater than ever. This statement is especially pertinent for elderly and vulnerable populations. In this regard, the INTERSPEECH 2020 Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) Challenge offers competitors the opportunity to develop speech and language-based systems for the task of Alzheimer's Dementia (AD) recognition. The challenge data consists of speech recordings and their transcripts; the work presented herein is an assessment of different contemporary approaches on these modalities. Specifically, we compared a hierarchical neural network with an attention mechanism trained on linguistic features with three acoustic-based systems: (i) Bag-of-Audio-Words (BoAW) quantising different low-level descriptors, (ii) a Siamese Network trained on log-Mel spectrograms, and (iii) a Convolutional Neural Network (CNN) end-to-end system trained on raw waveforms. Key results indicate the strength of the linguistic approach over the acoustic systems. Our strongest test-set result was achieved using a late fusion combination of BoAW, End-to-End CNN, and hierarchical-attention networks, which outperformed the challenge baseline in both the classification and regression tasks.
KW - Alzheimer's Disease
KW - Attention Mechanisms
KW - Bag-of-Audio-Words
KW - Convolutional Neural Network
KW - Hierarchical Neural Network
KW - Siamese Network
UR - http://www.scopus.com/inward/record.url?scp=85098104245&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2020-2635
DO - 10.21437/Interspeech.2020-2635
M3 - Conference contribution
AN - SCOPUS:85098104245
SN - 9781713820697
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 2182
EP - 2186
BT - Interspeech 2020
PB - International Speech Communication Association
T2 - 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020
Y2 - 25 October 2020 through 29 October 2020
ER -