Human Attention in Fine-grained Classification

Yao Rong, Wenjia Xu, Zeynep Akata, Enkelejda Kasneci

Research output: Contribution to conference › Paper › peer-review

5 Scopus citations


The way humans attend to, process, and classify a given image has the potential to vastly benefit the performance of deep learning models. Exploiting where humans focus can correct models when they deviate from the features essential for correct decisions. To validate that human attention contains valuable information for decision-making processes such as fine-grained classification, we compare human attention and model explanations in discovering important features. Towards this goal, we collect human gaze data for the fine-grained classification dataset CUB and build a dataset named CUB-GHA (Gaze-based Human Attention). Furthermore, we propose Gaze Augmentation Training (GAT) and the Knowledge Fusion Network (KFN) to integrate human gaze knowledge into classification models. We implement our proposals on CUB-GHA and on the recently released medical dataset CXR-Eye of chest X-ray images, which includes gaze data collected from a radiologist. Our results reveal that integrating human attention knowledge benefits classification effectively, e.g. improving the baseline by 4.38% on CXR. Hence, our work not only provides valuable insights into human attention in fine-grained classification but also contributes to future research on integrating human gaze with computer vision tasks. CUB-GHA and code are available at

Original language: English
State: Published - 2021
Externally published: Yes
Event: 32nd British Machine Vision Conference, BMVC 2021 - Virtual, Online
Duration: 22 Nov 2021 to 25 Nov 2021



