Modeling Action Spatiotemporal Relationships Using Graph-Based Class-Level Attention Network for Long-Term Action Detection

Yuankai Wu, Xin Su, Driton Salihu, Hao Xing, Marsil Zakour, Constantin Patsch

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

In recent years, action detection has become an active research topic in fields such as human-robot interaction and assistive robotics. Most previous methods focus on temporal processing of the action representation and do not consider the dependencies among action classes. However, the actions that occur in a video are often related, and this correlation can offer effective clues for detection. In this work, we propose to exploit the information of related action classes with the help of a graph neural network in conjunction with temporal modeling. We introduce the attention-based temporal class module (ATC), which models the inherent action dependencies on a graph and learns action-specific features along the temporal dimension with a dual-branch attention mechanism. Further, we present the Graph-based Class-level Attention Network (GCAN), which is built from ATC modules with increasing temporal receptive fields to handle action instances in complex untrimmed videos. Our network is evaluated on two challenging benchmark datasets with dense annotations: Charades and MultiTHUMOS. Experimental results show that our approach achieves highly competitive results with significantly reduced model complexity.
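As a rough illustration of the two ideas the abstract combines, the sketch below mixes per-frame class scores over a class-relation graph and then applies attention over time for each class. It is not the authors' ATC or GCAN implementation: the function names, the scalar per-class features, the hand-written adjacency matrix, and the dot-product attention are all assumptions made to keep the example small.

```python
import math


def normalize_rows(adj):
    """Add self-loops to an adjacency matrix and row-normalize it."""
    n = len(adj)
    out = []
    for i in range(n):
        row = [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        total = sum(row)
        out.append([v / total for v in row])
    return out


def propagate_classes(scores, adj):
    """One graph propagation step per frame (hypothetical helper).

    scores[t][c] is the score of class c at frame t; each class score is
    replaced by a weighted average over itself and its graph neighbours.
    """
    a = normalize_rows(adj)
    n = len(adj)
    return [
        [sum(a[c][k] * frame[k] for k in range(n)) for c in range(n)]
        for frame in scores
    ]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def temporal_attention(scores):
    """Per-class dot-product attention over time (assumed form).

    Each class is treated as its own time series; every frame attends to
    all frames of the same class and takes a weighted average of them.
    """
    num_frames = len(scores)
    num_classes = len(scores[0])
    out = [[0.0] * num_classes for _ in range(num_frames)]
    for c in range(num_classes):
        series = [scores[t][c] for t in range(num_frames)]
        for t in range(num_frames):
            weights = softmax([series[t] * series[u] for u in range(num_frames)])
            out[t][c] = sum(w * v for w, v in zip(weights, series))
    return out


# Toy input: 4 frames, 3 classes; classes 0 and 1 are assumed to be related.
toy_adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_scores = [[0.9, 0.1, 0.0], [0.8, 0.3, 0.1], [0.2, 0.7, 0.0], [0.1, 0.9, 0.2]]
refined = temporal_attention(propagate_classes(toy_scores, toy_adj))
```

In this toy setup, the graph step lets evidence for one class raise the score of a related class within the same frame, while the attention step smooths each class over time. A real model would operate on learned feature vectors and learned graph weights rather than on scalar scores.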

Original language: English
Title of host publication: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6719-6726
Number of pages: 8
ISBN (Electronic): 9781665491907
State: Published - 2023
Event: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023 - Detroit, United States
Duration: 1 Oct 2023 → 5 Oct 2023

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023
Country/Territory: United States
City: Detroit
Period: 1/10/23 → 5/10/23
