TY - GEN
T1 - Applying Speech Derived Breathing Patterns to Automatically Classify Human Confidence
AU - Deshpande, Gauri
AU - Gudipalli, Yagna
AU - Patel, Sachin
AU - Schuller, Björn W.
N1 - Publisher Copyright:
© 2023 European Signal Processing Conference, EUSIPCO. All rights reserved.
PY - 2023
Y1 - 2023
AB - Non-verbal expressions of speech are used to understand a spectrum of human behaviour parameters, one of them being confidence. Several speech representation techniques, from hand-crafted features to auto-encoder representations, have been explored for mining such information. We introduce a deep network trained on data from 100 speakers for the extraction of breathing patterns from speech signals. This network achieves an average Pearson's correlation coefficient of 0.61 and a breaths-per-minute error of 2.5 across the 100 speakers. In this paper, we propose the novel use of speech-derived breathing patterns as the feature set for the binary classification of confidence levels. A classification model trained on data from 51 interview candidates achieves an average AUC of 76% in separating confident speakers from non-confident ones using breathing patterns as the feature set. Compared with Mel-frequency cepstral coefficients and auto-encoder representations, this corresponds to absolute improvements of 8% and 5%, respectively.
KW - affective computing
KW - computational paralinguistics
KW - human confidence classification
KW - speech-breathing
KW - time-series analysis
UR - http://www.scopus.com/inward/record.url?scp=85178337767&partnerID=8YFLogxK
U2 - 10.23919/EUSIPCO58844.2023.10289872
DO - 10.23919/EUSIPCO58844.2023.10289872
M3 - Conference contribution
AN - SCOPUS:85178337767
T3 - European Signal Processing Conference
SP - 1335
EP - 1339
BT - 31st European Signal Processing Conference, EUSIPCO 2023 - Proceedings
PB - European Signal Processing Conference, EUSIPCO
T2 - 31st European Signal Processing Conference, EUSIPCO 2023
Y2 - 4 September 2023 through 8 September 2023
ER -
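
Note: the abstract above reports two evaluation metrics, Pearson's correlation between predicted and reference breathing signals, and AUC for the binary confidence classifier. The following is a minimal sketch of that evaluation setup, assuming Python with NumPy, SciPy, and scikit-learn; the signals, feature dimensions, classifier choice, and all data below are hypothetical placeholders for illustration, not the authors' network or dataset.

    import numpy as np
    from scipy.stats import pearsonr
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)

    # --- Breathing-pattern extraction quality (cf. the reported r = 0.61) ---
    # Stand-in signals: "pred" plays the role of a speech-derived breathing
    # waveform, "ref" a reference breathing-belt recording.
    ref = np.sin(np.linspace(0, 40 * np.pi, 4000))      # synthetic breathing cycle
    pred = ref + 0.8 * rng.standard_normal(ref.shape)   # noisy prediction
    r, _ = pearsonr(pred, ref)
    print(f"Pearson r between predicted and reference breathing: {r:.2f}")

    # --- Confidence classification from breathing features (cf. AUC = 76%) ---
    # Each row is one speaker's fixed-length breathing-pattern feature vector;
    # labels mark confident (1) vs. non-confident (0) speakers. Both are
    # synthetic placeholders standing in for the 51-candidate interview data.
    n_speakers, n_features = 51, 300
    X = rng.standard_normal((n_speakers, n_features))
    y = rng.integers(0, 2, size=n_speakers)

    clf = LogisticRegression(max_iter=1000)
    auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    print(f"Mean cross-validated AUC: {auc:.2f}")

On the random placeholder data used here the AUC hovers near chance (0.5); the 76% reported in the record comes from real speech-derived breathing features on the interview corpus.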