TY - GEN
T1 - Preserving actual dynamic trend of emotion in dimensional speech emotion recognition
AU - Han, Wenjing
AU - Li, Haifeng
AU - Eyben, Florian
AU - Ma, Lin
AU - Sun, Jiayin
AU - Schuller, Björn
PY - 2012
Y1 - 2012
N2 - In this paper, we use the concept of the dynamic trend of emotion to describe how a person's emotion changes over time, which is believed to be important for understanding a speaker's stance toward the current topic in an interaction. However, to the best of our knowledge, this concept has not received enough attention in the field of speech emotion recognition (SER). This paper therefore aims to draw researchers' attention to the concept and makes a preliminary effort toward predicting the correct dynamic trend of emotion in SER. Specifically, we propose a novel algorithm named Order Preserving Network (OPNet). First, as the key step in constructing OPNet, we employ a probabilistic method to define an emotion trend-sensitive loss function. Then, a nonlinear neural network is trained with gradient descent to minimize this loss function. We validate the prediction performance of OPNet on the VAM corpus, using mean linear error and the rank correlation coefficient γ as measures. Compared to k-Nearest Neighbor and support vector regression, the proposed OPNet better preserves the actual dynamic trend of emotion.
AB - In this paper, we use the concept of the dynamic trend of emotion to describe how a person's emotion changes over time, which is believed to be important for understanding a speaker's stance toward the current topic in an interaction. However, to the best of our knowledge, this concept has not received enough attention in the field of speech emotion recognition (SER). This paper therefore aims to draw researchers' attention to the concept and makes a preliminary effort toward predicting the correct dynamic trend of emotion in SER. Specifically, we propose a novel algorithm named Order Preserving Network (OPNet). First, as the key step in constructing OPNet, we employ a probabilistic method to define an emotion trend-sensitive loss function. Then, a nonlinear neural network is trained with gradient descent to minimize this loss function. We validate the prediction performance of OPNet on the VAM corpus, using mean linear error and the rank correlation coefficient γ as measures. Compared to k-Nearest Neighbor and support vector regression, the proposed OPNet better preserves the actual dynamic trend of emotion.
KW - Dynamic trend of emotion
KW - Loss function
KW - Neural network
KW - Speech emotion recognition
UR - http://www.scopus.com/inward/record.url?scp=84870216464&partnerID=8YFLogxK
U2 - 10.1145/2388676.2388786
DO - 10.1145/2388676.2388786
M3 - Conference contribution
AN - SCOPUS:84870216464
SN - 9781450314671
T3 - ICMI'12 - Proceedings of the ACM International Conference on Multimodal Interaction
SP - 523
EP - 528
BT - ICMI'12 - Proceedings of the ACM International Conference on Multimodal Interaction
T2 - 14th ACM International Conference on Multimodal Interaction, ICMI 2012
Y2 - 22 October 2012 through 26 October 2012
ER -