TY - CONF
T1 - MID-Fusion: Octree-based Object-Level Multi-Instance Dynamic SLAM
T2 - 2019 International Conference on Robotics and Automation, ICRA 2019
AU - Xu, Binbin
AU - Li, Wenbin
AU - Tzoumanikas, Dimos
AU - Bloesch, Michael
AU - Davison, Andrew
AU - Leutenegger, Stefan
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
AB - We propose a new multi-instance dynamic RGB-D SLAM system using an object-level octree-based volumetric representation. It provides robust camera tracking in dynamic environments and, at the same time, continuously estimates geometric, semantic, and motion properties for arbitrary objects in the scene. For each incoming frame, we perform instance segmentation to detect objects and refine mask boundaries using geometric and motion information. Meanwhile, we estimate the pose of each existing moving object using an object-oriented tracking method and robustly track the camera pose against the static scene. Based on the estimated camera and object poses, we associate segmented masks with existing models and incrementally fuse the corresponding colour, depth, semantic, and foreground object probabilities into each object model. In contrast to existing approaches, ours is the first system to generate an object-level dynamic volumetric map from a single RGB-D camera, which can be used directly for robotic tasks. Our method runs at 2-3 Hz on a CPU, excluding the instance segmentation component. We demonstrate its effectiveness quantitatively and qualitatively on both synthetic and real-world sequences.
UR - http://www.scopus.com/inward/record.url?scp=85071506609&partnerID=8YFLogxK
U2 - 10.1109/ICRA.2019.8794371
DO - 10.1109/ICRA.2019.8794371
M3 - Conference contribution
AN - SCOPUS:85071506609
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 5231
EP - 5237
BT - 2019 International Conference on Robotics and Automation, ICRA 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 May 2019 through 24 May 2019
ER -