GRAtt-VIS: Gated Residual Attention for Video Instance Segmentation

Tanveer Hannan, Rajat Koner, Maximilian Bernhard, Suprosanna Shit, Bjoern Menze, Volker Tresp, Matthias Schubert, Thomas Seidl

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Video Instance Segmentation (VIS) has seen a growing reliance on query propagation-based methods to model complex and lengthy videos. While these methods lead in performance, they do not explicitly model discrete events such as occlusion, disappearance, and reappearance. Such events often result in degraded object features over time. We believe that learning these events end-to-end with the propagation network would prevent this degradation. To this end, we propose a novel propagation method that models these discrete events with a gating mechanism. First, the gate identifies degraded object features caused by these events. Second, we apply a residual configuration to rectify the feature degradation, alleviating the need for a conventional memory bank. Third, we restrict interaction between relevant and degraded objects with a novel gated self-attention. Together, the gated residual configuration and the gated self-attention form the GRAtt block, which can easily be integrated into existing propagation frameworks. GRAtt-VIS performs on par with state-of-the-art methods on the YTVIS-19, -21, and -22 datasets and on the challenging OVIS dataset, significantly improving performance over previous methods. The code is available in the supplementary material.
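The three steps named in the abstract (gate, residual fallback, gated self-attention) can be illustrated with a small NumPy sketch. This is only one possible reading of the abstract, not the authors' implementation: the function name `gated_residual_attention`, the linear gate vector `w_gate`, the hard `threshold`, and the choice to leave degraded queries un-updated are all assumptions made for illustration.

```python
import numpy as np

def gated_residual_attention(prev_q, cur_q, w_gate, threshold=0.5):
    """Illustrative propagation step for N object queries of width D.

    prev_q, cur_q: (N, D) query features from the previous / current frame.
    w_gate: (D,) hypothetical gate weights (an assumption, not from the paper).
    """
    # 1) Gate: a per-query score in (0, 1); low scores mark degraded features.
    score = 1.0 / (1.0 + np.exp(-(cur_q @ w_gate)))
    keep = score > threshold                      # (N,) boolean gate

    # 2) Residual: a degraded query falls back to its previous-frame feature,
    #    so no separate memory bank is consulted.
    carried = np.where(keep[:, None], cur_q, prev_q)

    # With every gate closed there is nothing to attend to.
    if not keep.any():
        return carried, keep

    # 3) Gated self-attention: degraded queries are masked out as keys,
    #    so reliable queries never attend to them.
    logits = carried @ carried.T / np.sqrt(carried.shape[1])
    logits = np.where(keep[None, :], logits, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    attended = carried + weights @ carried

    # Degraded queries keep their carried-over feature unchanged (assumption).
    out = np.where(keep[:, None], attended, carried)
    return out, keep
```

In this sketch a query whose gate is closed simply carries its previous-frame feature forward, which is the sense in which the residual path stands in for a memory bank; a trained model would learn the gate rather than threshold a fixed linear score.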

Original language: English
Title of host publication: Pattern Recognition - 27th International Conference, ICPR 2024, Proceedings
Editors: Apostolos Antonacopoulos, Subhasis Chaudhuri, Rama Chellappa, Cheng-Lin Liu, Saumik Bhattacharya, Umapada Pal
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 268-282
Number of pages: 15
ISBN (Print): 9783031784439
DOIs
State: Published - 2025
Externally published: Yes
Event: 27th International Conference on Pattern Recognition, ICPR 2024 - Kolkata, India
Duration: 1 Dec 2024 → 5 Dec 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15316 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th International Conference on Pattern Recognition, ICPR 2024
Country/Territory: India
City: Kolkata
Period: 1/12/24 → 5/12/24

Keywords

  • Multi Object Tracking
  • Video Instance Segmentation
