TY - JOUR
T1 - Frustration recognition from speech during game interaction using wide residual networks
AU - Song, Meishu
AU - Mallol-Ragolta, Adria
AU - Parada-Cabaleiro, Emilia
AU - Yang, Zijiang
AU - Liu, Shuo
AU - Ren, Zhao
AU - Zhao, Ziping
AU - Schuller, Björn W.
N1 - Publisher Copyright: © 2019 Beijing Zhongke Journal Publishing Co. Ltd
PY - 2021/2
Y1 - 2021/2
N2 - Background: Although frustration is a common emotional reaction while playing games, an excessive level of frustration can negatively impact a user's experience, discouraging them from further game interactions. The automatic detection of frustration can enable the development of adaptive systems that tailor a game to a user's specific needs through real-time difficulty adjustment, thereby optimizing the player's experience and guaranteeing game success. To this end, we present a speech-based approach for the automatic detection of frustration during game interactions, a specific task that remains underexplored in research. Method: The experiments were performed on the Multimodal Game Frustration Database (MGFD), an audiovisual dataset, collected within the Wizard-of-Oz framework, that is specially tailored to investigate verbal and facial expressions of frustration during game interactions. We explored the performance of a variety of acoustic feature sets, including Mel-Spectrograms, Mel-Frequency Cepstral Coefficients (MFCCs), and the low-dimensional knowledge-based acoustic feature set eGeMAPS. Unlike the MGFD baseline, which is based on the Long Short-Term Memory (LSTM) architecture and a Support Vector Machine (SVM) classifier, and motivated by the continual improvements that convolutional neural networks (CNNs) have achieved in speech recognition tasks, the present work considers typical CNNs, including ResNet, VGG, and AlexNet. Furthermore, given the unresolved debate on the suitability of shallow and deep networks, we also examine the performance of two of the latest deep CNNs: WideResNet and EfficientNet. Results: Our best result, achieved with WideResNet and Mel-Spectrogram features, increases the system performance from 58.8% unweighted average recall (UAR) to 93.1% UAR for speech-based automatic frustration recognition.
AB - Background: Although frustration is a common emotional reaction while playing games, an excessive level of frustration can negatively impact a user's experience, discouraging them from further game interactions. The automatic detection of frustration can enable the development of adaptive systems that tailor a game to a user's specific needs through real-time difficulty adjustment, thereby optimizing the player's experience and guaranteeing game success. To this end, we present a speech-based approach for the automatic detection of frustration during game interactions, a specific task that remains underexplored in research. Method: The experiments were performed on the Multimodal Game Frustration Database (MGFD), an audiovisual dataset, collected within the Wizard-of-Oz framework, that is specially tailored to investigate verbal and facial expressions of frustration during game interactions. We explored the performance of a variety of acoustic feature sets, including Mel-Spectrograms, Mel-Frequency Cepstral Coefficients (MFCCs), and the low-dimensional knowledge-based acoustic feature set eGeMAPS. Unlike the MGFD baseline, which is based on the Long Short-Term Memory (LSTM) architecture and a Support Vector Machine (SVM) classifier, and motivated by the continual improvements that convolutional neural networks (CNNs) have achieved in speech recognition tasks, the present work considers typical CNNs, including ResNet, VGG, and AlexNet. Furthermore, given the unresolved debate on the suitability of shallow and deep networks, we also examine the performance of two of the latest deep CNNs: WideResNet and EfficientNet. Results: Our best result, achieved with WideResNet and Mel-Spectrogram features, increases the system performance from 58.8% unweighted average recall (UAR) to 93.1% UAR for speech-based automatic frustration recognition.
KW - Frustration recognition
KW - Machine learning
KW - WideResNets
UR - http://www.scopus.com/inward/record.url?scp=85115849354&partnerID=8YFLogxK
U2 - 10.1016/j.vrih.2020.10.004
DO - 10.1016/j.vrih.2020.10.004
M3 - Article
AN - SCOPUS:85115849354
SN - 2096-5796
VL - 3
SP - 76
EP - 86
JO - Virtual Reality and Intelligent Hardware
JF - Virtual Reality and Intelligent Hardware
IS - 1
ER -