TY - JOUR
T1 - SEWA DB
T2 - A Rich Database for Audio-Visual Emotion and Sentiment Research in the Wild
AU - Kossaifi, Jean
AU - Walecki, Robert
AU - Panagakis, Yannis
AU - Shen, Jie
AU - Schmitt, Maximilian
AU - Ringeval, Fabien
AU - Han, Jing
AU - Pandit, Vedhas
AU - Toisoul, Antoine
AU - Schuller, Björn
AU - Star, Kam
AU - Hajiyev, Elnar
AU - Pantic, Maja
N1 - Publisher Copyright:
© 1979-2012 IEEE.
PY - 2021/3/1
Y1 - 2021/3/1
N2 - Natural human-computer interaction and audio-visual human behaviour sensing systems that achieve robust performance in the wild are needed more than ever, as digital devices are increasingly becoming an indispensable part of our lives. Accurately annotated real-world data are the crux of devising such systems. However, existing databases usually consider controlled settings, low demographic variability, and a single task. In this paper, we introduce the SEWA database of more than 2,000 minutes of audio-visual data of 398 people from six cultures, 50 percent female, uniformly spanning the age range of 18 to 65 years. Subjects were recorded in two different contexts: while watching adverts and while discussing those adverts in a video chat. The database includes rich annotations of the recordings in terms of facial landmarks, facial action units (FAUs), various vocalisations, mirroring, and continuously valued valence, arousal, liking, agreement, and prototypic examples of (dis)liking. This database aims to be an extremely valuable resource for researchers in affective computing and automatic human sensing, and is expected to push forward research in human behaviour analysis, including cultural studies. Along with the database, we provide extensive baseline experiments for automatic FAU detection and automatic valence, arousal, and (dis)liking intensity estimation.
AB - Natural human-computer interaction and audio-visual human behaviour sensing systems that achieve robust performance in the wild are needed more than ever, as digital devices are increasingly becoming an indispensable part of our lives. Accurately annotated real-world data are the crux of devising such systems. However, existing databases usually consider controlled settings, low demographic variability, and a single task. In this paper, we introduce the SEWA database of more than 2,000 minutes of audio-visual data of 398 people from six cultures, 50 percent female, uniformly spanning the age range of 18 to 65 years. Subjects were recorded in two different contexts: while watching adverts and while discussing those adverts in a video chat. The database includes rich annotations of the recordings in terms of facial landmarks, facial action units (FAUs), various vocalisations, mirroring, and continuously valued valence, arousal, liking, agreement, and prototypic examples of (dis)liking. This database aims to be an extremely valuable resource for researchers in affective computing and automatic human sensing, and is expected to push forward research in human behaviour analysis, including cultural studies. Along with the database, we provide extensive baseline experiments for automatic FAU detection and automatic valence, arousal, and (dis)liking intensity estimation.
KW - SEWA
KW - affect analysis
KW - arousal
KW - database
KW - emotion recognition
KW - facial action units
KW - in-the-wild
KW - valence
UR - http://www.scopus.com/inward/record.url?scp=85100826648&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2019.2944808
DO - 10.1109/TPAMI.2019.2944808
M3 - Article
C2 - 31581074
AN - SCOPUS:85100826648
SN - 0162-8828
VL - 43
SP - 1022
EP - 1040
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 3
M1 - 8854185
ER -