TY - GEN
T1 - Object-RPE
T2 - 2019 European Conference on Mobile Robots, ECMR 2019
AU - Hoang, Dinh Cuong
AU - Stoyanov, Todor
AU - Lilienthal, Achim J.
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
N2 - We present a system for accurate 3D instance-aware semantic reconstruction and 6D pose estimation, using an RGB-D camera. Our framework couples convolutional neural networks (CNNs) and a state-of-the-art dense Simultaneous Localisation and Mapping (SLAM) system, ElasticFusion, to achieve both high-quality semantic reconstruction as well as robust 6D pose estimation for relevant objects. The method presented in this paper extends a high-quality instance-aware semantic 3D Mapping system from previous work [1] by adding a 6D object pose estimator. While the main trend in CNN-based 6D pose estimation has been to infer object's position and orientation from single views of the scene, our approach explores performing pose estimation from multiple viewpoints, under the conjecture that combining multiple predictions can improve the robustness of an object detection system. The resulting system is capable of producing high-quality object-aware semantic reconstructions of room-sized environments, as well as accurately detecting objects and their 6D poses. The developed method has been verified through experimental validation on the YCB-Video dataset and a newly collected warehouse object dataset. Experimental results confirmed that the proposed system achieves improvements over state-of-the-art methods in terms of surface reconstruction and object pose prediction. Our code and video are available at https://sites.google.com/view/object-rpe.
AB - We present a system for accurate 3D instance-aware semantic reconstruction and 6D pose estimation, using an RGB-D camera. Our framework couples convolutional neural networks (CNNs) and a state-of-the-art dense Simultaneous Localisation and Mapping (SLAM) system, ElasticFusion, to achieve both high-quality semantic reconstruction as well as robust 6D pose estimation for relevant objects. The method presented in this paper extends a high-quality instance-aware semantic 3D Mapping system from previous work [1] by adding a 6D object pose estimator. While the main trend in CNN-based 6D pose estimation has been to infer object's position and orientation from single views of the scene, our approach explores performing pose estimation from multiple viewpoints, under the conjecture that combining multiple predictions can improve the robustness of an object detection system. The resulting system is capable of producing high-quality object-aware semantic reconstructions of room-sized environments, as well as accurately detecting objects and their 6D poses. The developed method has been verified through experimental validation on the YCB-Video dataset and a newly collected warehouse object dataset. Experimental results confirmed that the proposed system achieves improvements over state-of-the-art methods in terms of surface reconstruction and object pose prediction. Our code and video are available at https://sites.google.com/view/object-rpe.
UR - http://www.scopus.com/inward/record.url?scp=85074398548&partnerID=8YFLogxK
U2 - 10.1109/ECMR.2019.8870927
DO - 10.1109/ECMR.2019.8870927
M3 - Conference contribution
AN - SCOPUS:85074398548
T3 - 2019 European Conference on Mobile Robots, ECMR 2019 - Proceedings
BT - 2019 European Conference on Mobile Robots, ECMR 2019 - Proceedings
A2 - Preucil, Libor
A2 - Behnke, Sven
A2 - Kulich, Miroslav
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 4 September 2019 through 6 September 2019
ER -