Dense Coordinate Channel Attention Network for Depression Level Estimation from Speech

Ziping Zhao, Shizhao Liu, Mingyue Niu, Haishuai Wang, Björn W. Schuller

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

1 Scopus citation

Abstract

Automatic depression level estimation from speech is an active research topic in computational emotion recognition. One symptom commonly exhibited by patients with depression is erratic speech volume; thus, patients’ voices can serve as a bio-signature of their level of depression. However, speech signals have time-frequency properties: different frequencies and different timestamps contribute to depression detection in different ways. Accordingly, we design a Coordinate Channel Attention (CCA) block to distinguish tensor information according to its contribution. A dense block extracts deep speech features and, together with the CCA blocks, forms our proposed Dense Coordinate Channel Attention Network (DCCANet). Subsequently, a vectorization block fuses the high-dimensional information. We split the original long speech recording into short audio segments of equal length, extract features from each segment, and feed them into the network to predict Beck Depression Inventory-II (BDI-II) scores. The mean of the segment scores is then taken as the individual’s depression level. Experiments on the AVEC2013 and AVEC2014 datasets demonstrate the effectiveness of DCCANet, which outperforms several existing methods.
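The segment-and-average step described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the `score_segment` callback (standing in for feature extraction plus the network), and the choice to drop a trailing partial segment are all assumptions, since the abstract does not specify them.

```python
from statistics import fmean
from typing import Callable, List, Sequence


def split_into_segments(samples: Sequence[float], segment_len: int) -> List[Sequence[float]]:
    """Cut a recording into non-overlapping segments of equal length.

    Assumption: a trailing remainder shorter than segment_len is dropped;
    the abstract does not say how leftovers are handled.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    n_full = len(samples) // segment_len
    return [samples[i * segment_len:(i + 1) * segment_len] for i in range(n_full)]


def estimate_depression_level(
    samples: Sequence[float],
    segment_len: int,
    score_segment: Callable[[Sequence[float]], float],
) -> float:
    """Score every segment and return the mean as the recording-level estimate.

    score_segment is a hypothetical placeholder for the per-segment
    predictor (feature extraction followed by the trained network).
    """
    segments = split_into_segments(samples, segment_len)
    if not segments:
        raise ValueError("recording is shorter than one segment")
    return fmean(score_segment(seg) for seg in segments)
```

In use, `score_segment` would wrap whatever model maps one segment to a BDI-II prediction; averaging over segments lets recordings of different lengths yield a single score per individual.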

Original language: English
Title of host publication: Pattern Recognition - 27th International Conference, ICPR 2024, Proceedings
Editors: Apostolos Antonacopoulos, Subhasis Chaudhuri, Rama Chellappa, Cheng-Lin Liu, Saumik Bhattacharya, Umapada Pal
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 402-413
Number of pages: 12
ISBN (Print): 9783031782008
State: Published - 2025
Event: 27th International Conference on Pattern Recognition, ICPR 2024 - Kolkata, India
Duration: 1 Dec 2024 → 5 Dec 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15313 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th International Conference on Pattern Recognition, ICPR 2024
Country/Territory: India
City: Kolkata
Period: 1/12/24 → 5/12/24

Keywords

  • Coordinate Channel Attention
  • Depression Level Estimation
  • feature extraction
  • speech signals
  • time-frequency properties
