X-AWARE: ConteXt-AWARE Human-Environment Attention Fusion for Driver Gaze Prediction in the Wild

Lukas Stappen, Georgios Rizos, Björn Schuller

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

17 Scopus citations

Abstract

Reliable systems for automatic estimation of the driver's gaze are crucial for reducing the number of traffic fatalities and for many emerging research areas aimed at developing intelligent vehicle-passenger systems. Gaze estimation is a challenging task, especially in environments with varying illumination and reflection properties. Furthermore, there is wide diversity in the appearance of drivers' faces, both in terms of occlusions (e.g. vision aids) and cultural/ethnic backgrounds. For this reason, analysing the face along with contextual information (for example, the vehicle cabin environment) adds another, less subjective signal towards the design of robust systems for passenger gaze estimation. In this paper, we present an integrated approach to jointly model different features for this task. In particular, to improve the fusion of the visually captured environment with the driver's face, we have developed a contextual attention mechanism, X-AWARE, attached directly to the output convolutional layers of InceptionResNetV2 networks. To showcase the effectiveness of our approach, we use the Driver Gaze in the Wild dataset, recently released as part of the Eighth Emotion Recognition in the Wild (EmotiW) Challenge. Our best model outperforms the baseline by 15.03% absolute in accuracy on the validation set, and improves on the previously best reported result by 8.72% absolute on the test set.
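
The abstract does not give the internals of X-AWARE, so the following is only a generic illustration of attention-based fusion of a face feature map with a context (cabin) feature map, not the authors' architecture. The function name `attention_fuse`, the projection matrix `w`, the toy 8x8x16 shapes, and the random inputs standing in for convolutional outputs are all assumptions made for this sketch.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over a 1-D array.
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()


def attention_fuse(face_map, context_map, w):
    """Hypothetical fusion: the pooled face vector queries the context map.

    face_map, context_map: arrays of shape (H, W, C), standing in for the
    output feature maps of two convolutional backbones.
    w: (C, C) projection matrix (learned in a real model, random here).
    """
    h, wd, c = context_map.shape
    # Global-average-pool the face map into a single query vector, shape (C,).
    face_vec = face_map.reshape(-1, face_map.shape[-1]).mean(axis=0)
    # Flatten the context map into H*W spatial locations, shape (H*W, C).
    ctx = context_map.reshape(h * wd, c)
    # One relevance score per context location, scaled by sqrt(C).
    scores = ctx @ w @ face_vec / np.sqrt(c)
    weights = softmax(scores)
    # Attention-weighted sum of the context features, shape (C,).
    attended = weights @ ctx
    # Concatenate face and attended context features for a downstream classifier.
    fused = np.concatenate([face_vec, attended])
    return fused, weights.reshape(h, wd)


# Toy inputs; real feature maps would come from trained networks.
rng = np.random.default_rng(0)
face_map = rng.normal(size=(8, 8, 16))
context_map = rng.normal(size=(8, 8, 16))
w = rng.normal(size=(16, 16))

fused, weights = attention_fuse(face_map, context_map, w)
```

In this sketch `fused` has length 2C and `weights` is a spatial map that sums to one, which is what a classification head over gaze zones would consume in a setup of this kind.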

Original language: English
Title of host publication: ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction
Publisher: Association for Computing Machinery, Inc
Pages: 858-867
Number of pages: 10
ISBN (Electronic): 9781450375818
State: Published - 21 Oct 2020
Externally published: Yes
Event: 22nd ACM International Conference on Multimodal Interaction, ICMI 2020 - Virtual, Online, Netherlands
Duration: 25 Oct 2020 to 29 Oct 2020

Publication series

Name: ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction

Conference

Conference: 22nd ACM International Conference on Multimodal Interaction, ICMI 2020
Country/Territory: Netherlands
City: Virtual, Online
Period: 25/10/20 to 29/10/20

Keywords

  • attention fusion
  • context aware
  • gaze detection
  • in the wild
