TY - GEN
T1 - MEC 2017: Multimodal Emotion Recognition Challenge
T2 - 1st Asian Conference on Affective Computing and Intelligent Interaction, ACII Asia 2018
AU - Li, Ya
AU - Tao, Jianhua
AU - Schuller, Björn
AU - Shan, Shiguang
AU - Jiang, Dongmei
AU - Jia, Jia
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/9/21
Y1 - 2018/9/21
N2 - This paper introduces the baselines of the Multimodal Emotion Recognition Challenge (MEC) 2017, which is part of the first Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia) 2018. The aim of MEC 2017 is to improve the performance of emotion recognition under real-world conditions. The Chinese Natural Audio-Visual Emotion Database (CHEAVD) 2.0, an extension of the CHEAVD released in MEC 2016, serves as the challenge database. MEC 2017 comprises three sub-challenges, and 31 teams participate in all or some of them: 27 teams in the audio-only sub-challenge, 16 in the video-only sub-challenge, and 17 in the multimodal emotion recognition sub-challenge. The baselines of the audio-only and video-only sub-challenges are generated with Support Vector Machines (SVMs), with audio features and video features considered separately. In the multimodal sub-challenge, both feature-level fusion and decision-level fusion are employed. The baselines of the audio-only, video-only, and multimodal sub-challenges are 39.2%, 21.7%, and 35.7% macro average precision, respectively.
AB - This paper introduces the baselines of the Multimodal Emotion Recognition Challenge (MEC) 2017, which is part of the first Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia) 2018. The aim of MEC 2017 is to improve the performance of emotion recognition under real-world conditions. The Chinese Natural Audio-Visual Emotion Database (CHEAVD) 2.0, an extension of the CHEAVD released in MEC 2016, serves as the challenge database. MEC 2017 comprises three sub-challenges, and 31 teams participate in all or some of them: 27 teams in the audio-only sub-challenge, 16 in the video-only sub-challenge, and 17 in the multimodal emotion recognition sub-challenge. The baselines of the audio-only and video-only sub-challenges are generated with Support Vector Machines (SVMs), with audio features and video features considered separately. In the multimodal sub-challenge, both feature-level fusion and decision-level fusion are employed. The baselines of the audio-only, video-only, and multimodal sub-challenges are 39.2%, 21.7%, and 35.7% macro average precision, respectively.
KW - Audio-visual corpus
KW - Emotion recognition challenges
KW - Fusion methods
KW - Multimodal features
UR - http://www.scopus.com/inward/record.url?scp=85054999415&partnerID=8YFLogxK
U2 - 10.1109/ACIIAsia.2018.8470342
DO - 10.1109/ACIIAsia.2018.8470342
M3 - Conference contribution
AN - SCOPUS:85054999415
T3 - 2018 1st Asian Conference on Affective Computing and Intelligent Interaction, ACII Asia 2018
BT - 2018 1st Asian Conference on Affective Computing and Intelligent Interaction, ACII Asia 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 May 2018 through 22 May 2018
ER -