TY - JOUR
T1 - Patch2CAD
T2 - 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021
AU - Kuo, Weicheng
AU - Angelova, Anelia
AU - Lin, Tsung Yi
AU - Dai, Angela
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - 3D perception of object shapes from RGB image input is fundamental to semantic scene understanding, grounding image-based perception in our spatially 3-dimensional real-world environments. To achieve a mapping between image views of objects and 3D shapes, we leverage CAD model priors from existing large-scale databases and propose a novel approach to constructing a joint embedding space between 2D images and 3D CAD models in a patch-wise fashion, establishing correspondences between patches of an image view of an object and patches of CAD geometry. This enables part similarity reasoning for retrieving CAD models similar to a new image view without exact matches in the database. Our patch embedding provides more robust CAD retrieval for shape estimation in our end-to-end estimation of CAD model shape and pose for detected objects in a single input image. Experiments on in-the-wild, complex imagery from ScanNet show that our approach is more robust than the state of the art in real-world scenarios without any exact CAD matches.
UR - http://www.scopus.com/inward/record.url?scp=85218352844&partnerID=8YFLogxK
U2 - 10.1109/ICCV48922.2021.01236
DO - 10.1109/ICCV48922.2021.01236
M3 - Conference article
AN - SCOPUS:85218352844
SN - 1550-5499
SP - 12569
EP - 12579
JO - Proceedings of the IEEE International Conference on Computer Vision
JF - Proceedings of the IEEE International Conference on Computer Vision
Y2 - 11 October 2021 through 17 October 2021
ER -