TY - GEN
T1 - Fusing Visual Appearance and Geometry for Multi-Modality 6DoF Object Tracking
AU - Stoiber, Manuel
AU - Elsayed, Mariam
AU - Reichert, Anne E.
AU - Steidle, Florian
AU - Lee, Dongheui
AU - Triebel, Rudolph
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - In many applications of advanced robotic manipulation, six degrees of freedom (6DoF) object pose estimates are continuously required. In this work, we develop a multi-modality tracker that fuses information from visual appearance and geometry to estimate object poses. The algorithm extends our previous method ICG, which uses geometry, to additionally consider surface appearance. In general, object surfaces contain local characteristics from text, graphics, and patterns, as well as global differences from distinct materials and colors. To incorporate this visual information, two modalities are developed. For local characteristics, keypoint features are used to minimize distances between points from keyframes and the current image. For global differences, a novel region approach is developed that considers multiple regions on the object surface. In addition, it allows the modeling of external geometries. Experiments on the YCB-Video and OPT datasets demonstrate that our approach ICG+ performs best on both datasets, outperforming both conventional and deep learning-based methods. At the same time, the algorithm is highly efficient and runs at more than 300 Hz. The source code of our tracker is publicly available.
AB - In many applications of advanced robotic manipulation, six degrees of freedom (6DoF) object pose estimates are continuously required. In this work, we develop a multi-modality tracker that fuses information from visual appearance and geometry to estimate object poses. The algorithm extends our previous method ICG, which uses geometry, to additionally consider surface appearance. In general, object surfaces contain local characteristics from text, graphics, and patterns, as well as global differences from distinct materials and colors. To incorporate this visual information, two modalities are developed. For local characteristics, keypoint features are used to minimize distances between points from keyframes and the current image. For global differences, a novel region approach is developed that considers multiple regions on the object surface. In addition, it allows the modeling of external geometries. Experiments on the YCB-Video and OPT datasets demonstrate that our approach ICG+ performs best on both datasets, outperforming both conventional and deep learning-based methods. At the same time, the algorithm is highly efficient and runs at more than 300 Hz. The source code of our tracker is publicly available.
UR - http://www.scopus.com/inward/record.url?scp=85182525605&partnerID=8YFLogxK
U2 - 10.1109/IROS55552.2023.10341961
DO - 10.1109/IROS55552.2023.10341961
M3 - Conference contribution
AN - SCOPUS:85182525605
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 1170
EP - 1177
BT - 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023
Y2 - 1 October 2023 through 5 October 2023
ER -