TY - JOUR
T1 - Looking beyond the Simple Scenarios
T2 - Combining Learners and Optimizers in 3D Temporal Tracking
AU - Tan, David Joseph
AU - Navab, Nassir
AU - Tombari, Federico
N1 - Publisher Copyright: © 1995-2012 IEEE.
PY - 2017/11
Y1 - 2017/11
AB - 3D object temporal trackers estimate the 3D rotation and 3D translation of a rigid object by propagating the transformation from one frame to the next. To address this task, algorithms either learn the transformation between two consecutive frames or optimize an energy function to align the object to the scene. The motivation behind our approach stems from a consideration of the nature of learners and optimizers. Across evaluations on different types of objects and working conditions, we observe that they are complementary: on one hand, learners are more robust in challenging scenarios, while optimizers are prone to tracking failures caused by entrapment in local minima; on the other, optimizers can converge to better accuracy and minimize jitter. Therefore, we propose to bridge the gap between learners and optimizers to attain a robust and accurate RGB-D temporal tracker that runs at approximately 2 ms per frame using one CPU core. Our work is highly suitable for Augmented Reality (AR), Mixed Reality (MR) and Virtual Reality (VR) applications due to its robustness, accuracy, efficiency and low latency. Current systems often rely on simple scenarios: a single object in the absence of clutter, avoiding touching the object to prevent close-range partial occlusion, or selecting brightly colored objects so that they are easy to segment individually. Aiming to step beyond these scenarios, we demonstrate the capacity to handle challenging cases under clutter, partial occlusion and varying lighting conditions.
KW - 3D Tracking
KW - 6D Pose Estimation
KW - Random Forest
UR - http://www.scopus.com/inward/record.url?scp=85028472049&partnerID=8YFLogxK
U2 - 10.1109/TVCG.2017.2734539
DO - 10.1109/TVCG.2017.2734539
M3 - Article
C2 - 28809695
AN - SCOPUS:85028472049
SN - 1077-2626
VL - 23
SP - 2399
EP - 2409
JO - IEEE Transactions on Visualization and Computer Graphics
JF - IEEE Transactions on Visualization and Computer Graphics
IS - 11
M1 - 8007238
ER -