Abstract
Reliable systems for automatic estimation of the driver's gaze are crucial for reducing the number of traffic fatalities and for many emerging research areas aimed at developing intelligent vehicle-passenger systems. Gaze estimation is a challenging task, especially in environments with varying illumination and reflection properties. Furthermore, drivers' faces vary widely in appearance, both in terms of occlusions (e.g., vision aids) and cultural/ethnic backgrounds. For this reason, analysing the face along with contextual information (for example, the vehicle cabin environment) adds another, less subjective signal towards the design of robust systems for passenger gaze estimation. In this paper, we present an integrated approach to jointly model different features for this task. In particular, to improve the fusion of the visually captured environment with the driver's face, we have developed a contextual attention mechanism, X-AWARE, attached directly to the output convolutional layers of InceptionResNetV2 networks. To showcase the effectiveness of our approach, we use the Driver Gaze in the Wild dataset, recently released as part of the Eighth Emotion Recognition in the Wild (EmotiW) Challenge. Our best model outperforms the baseline by 15.03 percentage points (absolute) in accuracy on the validation set, and improves on the previously best reported result by 8.72 percentage points (absolute) on the test set.
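The idea of attention-based fusion of a face stream and a context stream can be illustrated with a minimal sketch. This is not the X-AWARE mechanism itself, whose details the abstract does not give. It assumes each stream has already been reduced to a pooled feature vector of equal length, and it uses a hypothetical learned scoring vector `score_weights` to decide how much each stream contributes. All function and parameter names here are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(face_feat, context_feat, score_weights):
    """Fuse two equal-length feature vectors with stream-level attention.

    Assumption: `score_weights` stands in for learned parameters. Each
    stream's score is its dot product with that vector, and the scores are
    normalised across the two streams with a softmax.
    """
    streams = [face_feat, context_feat]
    scores = [
        sum(w * x for w, x in zip(score_weights, feat)) for feat in streams
    ]
    alphas = softmax(scores)
    # Weighted sum of the two streams, one output value per feature dimension.
    fused = [
        sum(a * feat[i] for a, feat in zip(alphas, streams))
        for i in range(len(face_feat))
    ]
    return fused, alphas
```

With an all-zero scoring vector both streams receive equal weight, so the fused vector is simply their average. A trained scoring vector would shift weight towards whichever stream is more informative for a given input.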
| Original language | English |
|---|---|
| Title of host publication | ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction |
| Publisher | Association for Computing Machinery, Inc |
| Pages | 858-867 |
| Number of pages | 10 |
| ISBN (Electronic) | 9781450375818 |
| State | Published - 21 Oct 2020 |
| Externally published | Yes |
| Event | 22nd ACM International Conference on Multimodal Interaction, ICMI 2020 - Virtual, Online, Netherlands. Duration: 25 Oct 2020 → 29 Oct 2020 |
Publication series
| Name | ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction |
|---|---|
Conference
| Conference | 22nd ACM International Conference on Multimodal Interaction, ICMI 2020 |
|---|---|
| Country/Territory | Netherlands |
| City | Virtual, Online |
| Period | 25/10/20 → 29/10/20 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- attention fusion
- context aware
- gaze detection
- in the wild
Title
X-AWARE: ConteXt-AWARE Human-Environment Attention Fusion for Driver Gaze Prediction in the Wild