TY - JOUR
T1 - Self-Supervised Object-in-Gripper Segmentation from Robotic Motions
AU - Boerdijk, Wout
AU - Sundermeyer, Martin
AU - Durner, Maximilian
AU - Triebel, Rudolph
N1 - Publisher Copyright:
© 2020 Proceedings of Machine Learning Research. All rights reserved.
PY - 2020
Y1 - 2020
N2 - Accurate object segmentation is a crucial task in the context of robotic manipulation. However, creating sufficient annotated training data for neural networks is particularly time-consuming and often requires manual labeling. To this end, we propose a simple yet robust solution for learning to segment unknown objects grasped by a robot. Specifically, we exploit motion and temporal cues in RGB video sequences. Using optical flow estimation, we first learn to predict segmentation masks of our given manipulator. Then, these annotations are used in combination with motion cues to automatically distinguish between background, manipulator and unknown, grasped object. In contrast to existing systems, our approach is fully self-supervised and independent of precise camera calibration, 3D models or potentially imperfect depth data. We perform a thorough comparison with alternative baselines and approaches from the literature. The object masks and views are shown to be suitable training data for segmentation networks that generalize to novel environments and also allow for watertight 3D reconstruction.
AB - Accurate object segmentation is a crucial task in the context of robotic manipulation. However, creating sufficient annotated training data for neural networks is particularly time-consuming and often requires manual labeling. To this end, we propose a simple yet robust solution for learning to segment unknown objects grasped by a robot. Specifically, we exploit motion and temporal cues in RGB video sequences. Using optical flow estimation, we first learn to predict segmentation masks of our given manipulator. Then, these annotations are used in combination with motion cues to automatically distinguish between background, manipulator and unknown, grasped object. In contrast to existing systems, our approach is fully self-supervised and independent of precise camera calibration, 3D models or potentially imperfect depth data. We perform a thorough comparison with alternative baselines and approaches from the literature. The object masks and views are shown to be suitable training data for segmentation networks that generalize to novel environments and also allow for watertight 3D reconstruction.
KW - Object Segmentation
KW - Self-Supervised Learning
UR - http://www.scopus.com/inward/record.url?scp=85097917212&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85097917212
SN - 2640-3498
VL - 155
SP - 1231
EP - 1245
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 4th Conference on Robot Learning, CoRL 2020
Y2 - 16 November 2020 through 18 November 2020
ER -