TY - GEN
T1 - Personalized estimation of engagement from videos using active learning with deep reinforcement learning
AU - Rudovic, Ognjen
AU - Park, Hae Won
AU - Busche, John
AU - Schuller, Björn
AU - Breazeal, Cynthia
AU - Picard, Rosalind W.
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/6
Y1 - 2019/6
N2 - Perceiving users' engagement accurately is important for technologies that need to respond to learners in a natural and intelligent way. In this paper, we address the problem of automated estimation of engagement from videos of child-robot interactions recorded in unconstrained environments (kindergartens). This is challenging due to diverse and person-specific styles of engagement expressions through facial and body gestures, as well as because of illumination changes, partial occlusion, and a changing background in the classroom as each child is active. To tackle these difficult challenges, we propose a novel deep reinforcement learning architecture for active learning and estimation of engagement from video data. The key to our approach is the learning of a personalized policy that enables the model to decide whether to estimate the child's engagement level (low, medium, high) or, when uncertain, to query a human for a video label. Queried videos are labeled by a human expert in an offline manner, and used to personalize the policy and engagement classifier to a target child over time. We show, on a database of 43 children involved in robot-assisted learning activities (8 sessions over 3 months), that this combined human-AI approach can easily adapt its interpretations of engagement to the target child using only a handful of labeled videos, while being robust to the many complex influences on the data. The results show large improvements over a non-personalized approach and over traditional active learning methods.
UR - http://www.scopus.com/inward/record.url?scp=85083203932&partnerID=8YFLogxK
U2 - 10.1109/CVPRW.2019.00031
DO - 10.1109/CVPRW.2019.00031
M3 - Conference contribution
AN - SCOPUS:85083203932
T3 - IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
SP - 217
EP - 226
BT - Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2019
PB - IEEE Computer Society
T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2019
Y2 - 16 June 2019 through 20 June 2019
ER -