TY - GEN
T1 - Predicting Biological Signals from Speech
T2 - IEEE 21st International Workshop on Multimedia Signal Processing, MMSP 2019
AU - Baird, Alice
AU - Amiriparian, Shahin
AU - Berschneider, Miriam
AU - Schmitt, Maximilian
AU - Schuller, Björn
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
N2 - In recent years, diagnosis and awareness of mental health conditions, e.g., chronic stress, have been increasing globally. Biological signals can be an effective way to monitor such conditions, yet acquisition can be cumbersome and invasive. Alternatively, acoustic features offer non-invasive and efficient monitoring of an array of health and wellbeing characteristics. This study presents the BioSpeech Database (BioS-DB), a novel database of audio and biological signals - blood volume pulse (BVP) and skin conductance (SC) - from 55 individuals speaking aloud in front of others, whilst having their emotional state annotated in real time. Through a variety of conventional and state-of-the-art approaches, initial experiments have shown for the first time that acoustic features can be applied to the task of BVP prediction. Notably, using deep representations of audio and sequence-to-sequence auto-encoders, with a GRU-RNN as a time-dependent regressor, achieved at best an RMSE of 0.075 and 0.123 for [0; 1] normalised BVP and SC, respectively.
AB - In recent years, diagnosis and awareness of mental health conditions, e.g., chronic stress, have been increasing globally. Biological signals can be an effective way to monitor such conditions, yet acquisition can be cumbersome and invasive. Alternatively, acoustic features offer non-invasive and efficient monitoring of an array of health and wellbeing characteristics. This study presents the BioSpeech Database (BioS-DB), a novel database of audio and biological signals - blood volume pulse (BVP) and skin conductance (SC) - from 55 individuals speaking aloud in front of others, whilst having their emotional state annotated in real time. Through a variety of conventional and state-of-the-art approaches, initial experiments have shown for the first time that acoustic features can be applied to the task of BVP prediction. Notably, using deep representations of audio and sequence-to-sequence auto-encoders, with a GRU-RNN as a time-dependent regressor, achieved at best an RMSE of 0.075 and 0.123 for [0; 1] normalised BVP and SC, respectively.
UR - http://www.scopus.com/inward/record.url?scp=85075739673&partnerID=8YFLogxK
U2 - 10.1109/MMSP.2019.8901758
DO - 10.1109/MMSP.2019.8901758
M3 - Conference contribution
AN - SCOPUS:85075739673
T3 - IEEE 21st International Workshop on Multimedia Signal Processing, MMSP 2019
BT - IEEE 21st International Workshop on Multimedia Signal Processing, MMSP 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 27 September 2019 through 29 September 2019
ER -