Self-Attention Based Action Segmentation Using Intra-And Inter-Segment Representations

Constantin Patsch, Eckehard Steinbach

Publication: Contribution to journal, conference article, peer-reviewed

2 citations (Scopus)

Abstract

Segmenting activities in untrimmed videos remains a critical challenge to fully understand complex human activity sequences. A correct representation of temporal action relations is key for improving incorrect segmentations. We propose a self-attention-based model that refines initial segmentations by separately considering intra- as well as inter-segment relations between predicted action segments. Furthermore, in order to enhance the training process, we use a similarity-guided regularization technique that ensures intra-segment similarity and the validity of action transitions between adjacent segments. In an extensive evaluation on three public datasets, Georgia Tech Egocentric Activities (GTEA), 50Salads, and Breakfast, our proposed architecture enhances the backbone model by 6.1% on GTEA, 3.8% on 50Salads, and 3.9% on Breakfast with regard to the F1@50 metric.
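
The abstract reports gains in F1@50 without defining it. In the action segmentation literature this usually denotes a segmental F1 score in which a predicted segment counts as a true positive when its intersection-over-union with an unmatched ground-truth segment of the same class is at least 0.5. The following is a minimal Python sketch under that assumption; it is not the authors' evaluation code, and the function names `get_segments` and `f1_at_k` are hypothetical.

```python
def get_segments(labels):
    """Collapse a frame-wise label sequence into (label, start, end) segments.

    `end` is exclusive. An empty sequence yields no segments.
    """
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


def f1_at_k(pred, gt, overlap=0.5):
    """Segmental F1 at a given IoU threshold (assumed convention, see lead-in)."""
    pred_segs = get_segments(pred)
    gt_segs = get_segments(gt)
    matched = [False] * len(gt_segs)
    tp = 0
    fp = 0

    for label, p_start, p_end in pred_segs:
        best_iou, best_idx = 0.0, -1
        for idx, (g_label, g_start, g_end) in enumerate(gt_segs):
            if g_label != label:
                continue
            inter = max(0, min(p_end, g_end) - max(p_start, g_start))
            union = max(p_end, g_end) - min(p_start, g_start)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        # Each ground-truth segment can be matched at most once.
        if best_idx >= 0 and best_iou >= overlap and not matched[best_idx]:
            tp += 1
            matched[best_idx] = True
        else:
            fp += 1

    fn = matched.count(False)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the score is computed over segments rather than frames, a short spurious segment inside a long correct one is penalized as a false positive, which is why the metric is sensitive to the over-segmentation errors that a refinement stage aims to remove.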

Original language: English
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publication status: Published - 2023
Event: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece
Duration: 4 June 2023 to 10 June 2023
