TY - GEN
T1 - Exploring Shapely Values for Blood Glucose Level Prediction from Speech
AU - Pompe, Simone
AU - Mallol-Ragolta, Adria
AU - Schauer, Nicolas
AU - Schuller, Björn W.
N1 - Publisher Copyright: © VDE VERLAG GMBH Berlin Offenbach.
PY - 2023
Y1 - 2023
N2 - We explore a novel dataset for blood glucose level prediction from self-recorded speech. The dataset contains 10 h 30 min 25 s of data from 63 German patients (44 f, 19 m). We model the paralinguistic information embedded in the voice, exploiting the Low-Level Descriptors (LLD) of the eGeMAPS feature set. We investigate the use of Shapley values to understand the contribution of each individual LLD to the inferences produced by a Support Vector Machine (SVM). We also compare the performance of subsets of the LLDs selected by the Shapley values, or transformed using Principal Component Analysis (PCA). We tackle the task as a 3-class classification problem with the Unweighted Average Recall (UAR) as the evaluation metric. The baseline SVM model scores a UAR of 51.8 % on the test partition. The best model selecting a subset of the LLDs based on the Shapley values obtains a UAR of 56.8 %, while the top model transforming the LLDs with PCA reaches a UAR of 42.0 %, both on the test partition.
AB - We explore a novel dataset for blood glucose level prediction from self-recorded speech. The dataset contains 10 h 30 min 25 s of data from 63 German patients (44 f, 19 m). We model the paralinguistic information embedded in the voice, exploiting the Low-Level Descriptors (LLD) of the eGeMAPS feature set. We investigate the use of Shapley values to understand the contribution of each individual LLD to the inferences produced by a Support Vector Machine (SVM). We also compare the performance of subsets of the LLDs selected by the Shapley values, or transformed using Principal Component Analysis (PCA). We tackle the task as a 3-class classification problem with the Unweighted Average Recall (UAR) as the evaluation metric. The baseline SVM model scores a UAR of 51.8 % on the test partition. The best model selecting a subset of the LLDs based on the Shapley values obtains a UAR of 56.8 %, while the top model transforming the LLDs with PCA reaches a UAR of 42.0 %, both on the test partition.
UR - http://www.scopus.com/inward/record.url?scp=85183580674&partnerID=8YFLogxK
U2 - 10.30420/456164015
DO - 10.30420/456164015
M3 - Conference contribution
AN - SCOPUS:85183580674
T3 - Speech Communication - 15th ITG Conference
SP - 81
EP - 85
BT - Speech Communication - 15th ITG Conference
PB - VDE VERLAG GMBH
T2 - 15th ITG Conference on Speech Communication
Y2 - 22 September 2023 through 24 September 2023
ER -