TY - GEN
T1 - HouseCat6D - A Large-Scale Multi-Modal Category Level 6D Object Perception Dataset with Household Objects in Realistic Scenarios
AU - Jung, Hyun Jun
AU - Wu, Shun Cheng
AU - Ruhkamp, Patrick
AU - Zhai, Guangyao
AU - Schieber, Hannah
AU - Rizzoli, Giulia
AU - Wang, Pengyuan
AU - Zhao, Hongcheng
AU - Garattoni, Lorenzo
AU - Roth, Daniel
AU - Meier, Sven
AU - Navab, Nassir
AU - Busam, Benjamin
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Estimating 6D object poses is a major challenge in 3D computer vision. Building on successful instance-level approaches, research is shifting towards category-level pose estimation for practical applications. Current category-level datasets, however, fall short in annotation quality and pose variety. Addressing this, we introduce HouseCat6D, a new category-level 6D pose dataset. It features 1) multi-modality with Polarimetric RGB and Depth (RGBD+P), 2) encompasses 194 diverse objects across 10 household categories, including two photometrically challenging ones, and 3) provides high-quality pose annotations with an error range of only 1.35 mm to 1.74 mm. The dataset also includes 4) 41 large-scale scenes with comprehensive viewpoint and occlusion coverage, 5) a checkerboard-free environment, and 6) dense 6D parallel-jaw robotic grasp annotations. Additionally, we present benchmark results for leading category-level pose estimation networks.
AB - Estimating 6D object poses is a major challenge in 3D computer vision. Building on successful instance-level approaches, research is shifting towards category-level pose estimation for practical applications. Current category-level datasets, however, fall short in annotation quality and pose variety. Addressing this, we introduce HouseCat6D, a new category-level 6D pose dataset. It features 1) multi-modality with Polarimetric RGB and Depth (RGBD+P), 2) encompasses 194 diverse objects across 10 household categories, including two photometrically challenging ones, and 3) provides high-quality pose annotations with an error range of only 1.35 mm to 1.74 mm. The dataset also includes 4) 41 large-scale scenes with comprehensive viewpoint and occlusion coverage, 5) a checkerboard-free environment, and 6) dense 6D parallel-jaw robotic grasp annotations. Additionally, we present benchmark results for leading category-level pose estimation networks.
KW - 6D Pose Labels
KW - Accurate 3D Data Acquisition
KW - Category Level 6D Pose Estimation
KW - Dataset
KW - Photometrically Challenging Objects
KW - Robotic Grasping
KW - Robotic Manipulation
UR - http://www.scopus.com/inward/record.url?scp=85202284879&partnerID=8YFLogxK
U2 - 10.1109/CVPR52733.2024.02123
DO - 10.1109/CVPR52733.2024.02123
M3 - Conference contribution
AN - SCOPUS:85202284879
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 22498
EP - 22508
BT - Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
PB - IEEE Computer Society
T2 - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
Y2 - 16 June 2024 through 22 June 2024
ER -