TY - JOUR
T1 - Identifying depression-related topics in smartphone-collected free-response speech recordings using an automatic speech recognition system and a deep learning topic model
AU - RADAR-CNS consortium
AU - Zhang, Yuezhou
AU - Folarin, Amos A.
AU - Dineley, Judith
AU - Conde, Pauline
AU - de Angel, Valeria
AU - Sun, Shaoxiong
AU - Ranjan, Yatharth
AU - Rashid, Zulqarnain
AU - Stewart, Callum
AU - Laiou, Petroula
AU - Sankesara, Heet
AU - Qian, Linglong
AU - Matcham, Faith
AU - White, Katie
AU - Oetzmann, Carolin
AU - Lamers, Femke
AU - Siddi, Sara
AU - Simblett, Sara
AU - Schuller, Björn W.
AU - Vairavan, Srinivasan
AU - Wykes, Til
AU - Haro, Josep Maria
AU - Penninx, Brenda W.J.H.
AU - Narayan, Vaibhav A.
AU - Hotopf, Matthew
AU - Dobson, Richard J.B.
AU - Cummins, Nicholas
N1 - Publisher Copyright: © 2024 The Author(s)
PY - 2024/6/15
Y1 - 2024/6/15
N2 - Background: Prior research has associated spoken language use with depression, yet studies often involve small or non-clinical samples and face challenges in the manual transcription of speech. This paper aimed to automatically identify depression-related topics in speech recordings collected from clinical samples. Methods: The data included 3919 English free-response speech recordings collected via smartphones from 265 participants with a depression history. We transcribed speech recordings via automatic speech recognition (Whisper tool, OpenAI) and identified principal topics from transcriptions using a deep learning topic model (BERTopic). To identify depression risk topics and understand the context, we compared participants' depression severity and behavioral (extracted from wearable devices) and linguistic (extracted from transcribed texts) characteristics across identified topics. Results: From the 29 topics identified, we identified 6 risk topics for depression: ‘No Expectations’, ‘Sleep’, ‘Mental Therapy’, ‘Haircut’, ‘Studying’, and ‘Coursework’. Participants mentioning depression risk topics exhibited higher sleep variability, later sleep onset, and fewer daily steps and used fewer words, more negative language, and fewer leisure-related words in their speech recordings. Limitations: Our findings were derived from a depressed cohort with a specific speech task, potentially limiting the generalizability to non-clinical populations or other speech tasks. Additionally, some topics had small sample sizes, necessitating further validation in larger datasets. Conclusion: This study demonstrates that specific speech topics can indicate depression severity. The employed data-driven workflow provides a practical approach for analyzing large-scale speech data collected from real-world settings.
KW - Automatic speech recognition
KW - Depression
KW - Smartphone
KW - Speech
KW - Topic modeling
UR - http://www.scopus.com/inward/record.url?scp=85188991330&partnerID=8YFLogxK
U2 - 10.1016/j.jad.2024.03.106
DO - 10.1016/j.jad.2024.03.106
M3 - Article
C2 - 38552911
AN - SCOPUS:85188991330
SN - 0165-0327
VL - 355
SP - 40
EP - 49
JO - Journal of Affective Disorders
JF - Journal of Affective Disorders
ER -