TY - GEN
T1 - THÖR-Magni
T2 - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023
AU - De Almeida, Tiago Rodrigues
AU - Rudenko, Andrey
AU - Schreiter, Tim
AU - Zhu, Yufei
AU - Maestro, Eduardo Gutierrez
AU - Morillo-Mendez, Lucas
AU - Kucner, Tomasz P.
AU - Martinez Mozos, Oscar
AU - Magnusson, Martin
AU - Palmieri, Luigi
AU - Arras, Kai O.
AU - Lilienthal, Achim J.
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Autonomous systems that need to operate in human environments and interact with users rely on understanding and anticipating human activity and motion. Among the many factors that influence human motion, semantic attributes, such as the roles and ongoing activities of the detected people, provide a powerful cue to their future motion, actions, and intentions. In this work, we adapt several popular deep learning models for trajectory prediction with labels corresponding to the roles of the people. To this end, we use the novel THÖR-Magni dataset, which captures human activity in industrial settings and includes the relevant semantic labels for people who navigate complex environments, interact with objects and robots, and work alone and in groups. In qualitative and quantitative experiments, we show that the role-conditioned LSTM, Transformer, GAN, and VAE methods can effectively incorporate the semantic categories, better capture the underlying input distribution, and therefore produce more accurate motion predictions in terms of Top-K ADE/FDE and log-likelihood metrics.
KW - deep learning
KW - human motion dataset
KW - human trajectory prediction
UR - http://www.scopus.com/inward/record.url?scp=85182932549&partnerID=8YFLogxK
U2 - 10.1109/ICCVW60793.2023.00234
DO - 10.1109/ICCVW60793.2023.00234
M3 - Conference contribution
AN - SCOPUS:85182932549
T3 - Proceedings - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023
SP - 2192
EP - 2201
BT - Proceedings - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 2 October 2023 through 6 October 2023
ER -