TY - GEN
T1 - Personalised Deep Learning for Monitoring Depressed Mood from Speech
AU - Gerczuk, Maurice
AU - Triantafyllopoulos, Andreas
AU - Amiriparian, Shahin
AU - Kathan, Alexander
AU - Bauer, Jonathan
AU - Berking, Matthias
AU - Schuller, Björn
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - We utilise a longitudinal dataset of 17 526 speech samples collected from 30 patients with major depressive disorder and 11 sub-clinically depressed individuals to perform a personalised prediction of depressed mood. The data has been recorded via a smartphone app over a two-week ecological momentary assessment with three recording sessions per day. Each session's speech samples are accompanied by a self-assessed rating on the discrete visual analogue mood scale (VAMS) from 0–10. As these ratings are highly subjective, a personalised machine learning method is leveraged. For this purpose, the beginning of the recording period is utilised both to train a shared model backbone and to adapt personalised layers, added at the end of the network, to each speaker's speech. Our approach yields a Spearman's correlation coefficient (ρ) of 0.79 on the test set, compared to the non-personalised baseline of ρ=0.61. Furthermore, we analyse our results with regard to the type of speech sample – reading three depression-related questions, answering them, and freely formulating an uplifting spontaneous thought. Here, we find that personalisation boosts performance across all types, especially for the fixed-content question readings. Overall, our work highlights the efficacy of personalised machine learning for depressed mood monitoring.
AB - We utilise a longitudinal dataset of 17 526 speech samples collected from 30 patients with major depressive disorder and 11 sub-clinically depressed individuals to perform a personalised prediction of depressed mood. The data has been recorded via a smartphone app over a two-week ecological momentary assessment with three recording sessions per day. Each session's speech samples are accompanied by a self-assessed rating on the discrete visual analogue mood scale (VAMS) from 0–10. As these ratings are highly subjective, a personalised machine learning method is leveraged. For this purpose, the beginning of the recording period is utilised both to train a shared model backbone and to adapt personalised layers, added at the end of the network, to each speaker's speech. Our approach yields a Spearman's correlation coefficient (ρ) of 0.79 on the test set, compared to the non-personalised baseline of ρ=0.61. Furthermore, we analyse our results with regard to the type of speech sample – reading three depression-related questions, answering them, and freely formulating an uplifting spontaneous thought. Here, we find that personalisation boosts performance across all types, especially for the fixed-content question readings. Overall, our work highlights the efficacy of personalised machine learning for depressed mood monitoring.
KW - computational paralinguistics
KW - depression
KW - digital health
KW - personalisation
UR - http://www.scopus.com/inward/record.url?scp=85146563069&partnerID=8YFLogxK
U2 - 10.1109/EHB55594.2022.9991737
DO - 10.1109/EHB55594.2022.9991737
M3 - Conference contribution
AN - SCOPUS:85146563069
T3 - 2022 10th E-Health and Bioengineering Conference, EHB 2022
BT - 2022 10th E-Health and Bioengineering Conference, EHB 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th E-Health and Bioengineering Conference, EHB 2022
Y2 - 17 November 2022 through 18 November 2022
ER -