TY - JOUR
T1 - Are you sure? Analysing Uncertainty Quantification Approaches for Real-world Speech Emotion Recognition
AU - Schrüfer, Oliver
AU - Milling, Manuel
AU - Burkhardt, Felix
AU - Eyben, Florian
AU - Schuller, Björn
N1 - Publisher Copyright:
© 2024 International Speech Communication Association. All rights reserved.
PY - 2024
Y1 - 2024
AB - Uncertainty Quantification (UQ) is an important building block for the reliable use of neural networks in real-world scenarios, as it can be a useful tool in identifying faulty predictions. Speech emotion recognition (SER) models can suffer from particularly many sources of uncertainty, such as the ambiguity of emotions, Out-of-Distribution (OOD) data or, in general, poor recording conditions. Reliable UQ methods are thus of particular interest as in many SER applications no prediction is better than a faulty prediction. While the effects of label ambiguity on uncertainty are well documented in the literature, we focus our work on an evaluation of UQ methods for SER under common challenges in real-world application, such as corrupted signals, and the absence of speech. We show that simple UQ methods can already give an indication of the uncertainty of a prediction and that training with additional OOD data can greatly improve the identification of such signals.
KW - EDL
KW - Out-of-Distribution
KW - Prior Networks
KW - Speech Emotion Recognition
KW - Uncertainty Quantification
UR - http://www.scopus.com/inward/record.url?scp=85201419352&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2024-977
DO - 10.21437/Interspeech.2024-977
M3 - Conference article
AN - SCOPUS:85201419352
SN - 2308-457X
SP - 3210
EP - 3214
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 25th Interspeech Conference 2024
Y2 - 1 September 2024 through 5 September 2024
ER -