Self-Attention Based Action Segmentation Using Intra- and Inter-Segment Representations

Constantin Patsch, Eckehard Steinbach

Research output: Contribution to journal › Conference article › peer-review

4 Scopus citations

Abstract

Segmenting activities in untrimmed videos remains a critical challenge in fully understanding complex human activity sequences. A correct representation of temporal action relations is key to correcting erroneous segmentations. We propose a self-attention-based model that refines initial segmentations by separately considering intra- as well as inter-segment relations between predicted action segments. Furthermore, to enhance the training process, we use a similarity-guided regularization technique that ensures intra-segment similarity and the validity of action transitions between adjacent segments. In an extensive evaluation on three public datasets, Georgia Tech Egocentric Activities (GTEA), 50Salads, and Breakfast, our proposed architecture improves on the backbone model by 6.1% on GTEA, 3.8% on 50Salads, and 3.9% on Breakfast with regard to the F1@50 metric.
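
For readers unfamiliar with the F1@50 metric cited above, the following is a minimal sketch of how a segmental F1 score at an overlap threshold of 50% is commonly computed in the action segmentation literature. It is not taken from the paper: the function names `get_segments` and `f1_at_k` and the toy label sequences are illustrative assumptions, and the authors' evaluation code may differ in details such as background-class handling.

```python
def get_segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs; end is exclusive."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


def f1_at_k(pred, gt, threshold=0.5):
    """Segmental F1: a predicted segment is a true positive if its IoU with an
    unmatched ground-truth segment of the same label reaches the threshold.
    (Hypothetical helper, assuming the commonly used definition.)"""
    pred_segs = get_segments(pred)
    gt_segs = get_segments(gt)
    matched = [False] * len(gt_segs)
    tp = fp = 0
    for label, ps, pe in pred_segs:
        best_iou, best_idx = 0.0, -1
        for j, (gl, gs, ge) in enumerate(gt_segs):
            if gl != label:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            # Span of both segments; equals the true union whenever they overlap.
            union = max(pe, ge) - min(ps, gs)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_idx = iou, j
        if best_idx >= 0 and best_iou >= threshold and not matched[best_idx]:
            tp += 1
            matched[best_idx] = True
        else:
            fp += 1
    fn = matched.count(False)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: two short spurious segments at the start count as false positives.
gt = [0, 0, 0, 0, 1, 1, 1, 1]
pred = [0, 1, 0, 0, 1, 1, 1, 1]
score = f1_at_k(pred, gt, threshold=0.5)
```

Because the score is computed over segments rather than frames, short spurious segments are penalized even when frame-wise accuracy stays high, which is why the metric is sensitive to the over-segmentation errors that refinement models target.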

Keywords

  • Action Segmentation
  • Activity Recognition
  • Video Understanding
