LONG-TERM ACTION ANTICIPATION BASED ON CONTEXTUAL ALIGNMENT

Constantin Patsch, Jinghan Zhang, Yuankai Wu, Marsil Zakour, Driton Salihu, Eckehard Steinbach

Publication: Contribution to book/report/conference proceedings › Conference contribution › Peer-reviewed

Abstract

In action anticipation, a model predicts the next action after a certain observation period. Long-term action anticipation extends this idea to predicting multiple future actions and their respective durations. In this problem setting, the model should therefore not only capture relationships between past actions but also predict several future actions that fit a given context. In contrast to autoregressive models, our model employs an encoder-decoder structure to determine future actions and durations in parallel, which prevents the accumulation of prediction errors and reduces inference time. Furthermore, the predicted actions are aligned with a context representation, which resembles the way humans approach this task, since the context restricts the set of feasible actions. We evaluate our model on the long-term anticipation benchmark datasets Breakfast and 50Salads, where we achieve state-of-the-art results.
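The contrast with autoregressive decoding can be illustrated with a minimal sketch. It is not the authors' architecture: the function `parallel_decode`, the per-slot query vectors, and the two linear heads are hypothetical names introduced only for illustration, and the weights are random rather than trained. The sketch shows only the general idea that each future slot attends to the encoded observation independently, so no slot consumes an earlier prediction as input.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def parallel_decode(memory, queries, action_heads, duration_head):
    """Predict all future actions and durations in a single pass.

    memory:        T encoded observation vectors of size d (assumed given)
    queries:       N vectors of size d, one per future action slot (hypothetical)
    action_heads:  C vectors of size d, one score vector per action class (hypothetical)
    duration_head: one vector of size d scoring relative duration (hypothetical)
    """
    d = len(memory[0])
    slots = []
    for q in queries:
        # Cross-attention: this slot looks only at the observation,
        # never at the prediction made for another slot.
        weights = softmax([dot(q, m) / math.sqrt(d) for m in memory])
        slots.append([sum(w * m[j] for w, m in zip(weights, memory)) for j in range(d)])

    actions = []
    for s in slots:
        scores = [dot(s, head) for head in action_heads]
        actions.append(scores.index(max(scores)))

    # Softmax across slots, so durations are shares of the anticipated horizon.
    durations = softmax([dot(s, duration_head) for s in slots])
    return actions, durations


random.seed(0)
T, N, d, C = 12, 4, 8, 5  # observed steps, future slots, feature size, classes


def rand_vec():
    return [random.gauss(0.0, 1.0) for _ in range(d)]


memory = [rand_vec() for _ in range(T)]
queries = [rand_vec() for _ in range(N)]
action_heads = [rand_vec() for _ in range(C)]
duration_head = rand_vec()

actions, durations = parallel_decode(memory, queries, action_heads, duration_head)
```

Because the loop over slots has no dependency between iterations, an early mistake cannot propagate to later slots, and the slots could be computed as one batched operation. An autoregressive decoder would instead feed each predicted action back in as input for the next step.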

Original language: English
Title of host publication: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5920-5924
Number of pages: 5
ISBN (electronic): 9798350344851
Publication status: Published - 2024
Event: 49th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Seoul, South Korea
Duration: 14 Apr. 2024 to 19 Apr. 2024

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 49th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024
Country/Territory: South Korea
City: Seoul
Period: 14/04/24 to 19/04/24
