TY - JOUR
T1 - Learning 3D semantic scene graphs from 3D indoor reconstructions
AU - Wald, Johanna
AU - Dhamo, Helisa
AU - Navab, Nassir
AU - Tombari, Federico
N1 - Publisher Copyright:
© 2020 IEEE Computer Society. All rights reserved.
PY - 2020
Y1 - 2020
N2 - Scene understanding has been of high interest in computer vision. It encompasses not only identifying objects in a scene, but also their relationships within the given context. With this goal, a recent line of works tackles 3D semantic segmentation and scene layout prediction. In our work we focus on scene graphs, a data structure that organizes the entities of a scene in a graph, where objects are nodes and their relationships modeled as edges. We leverage inference on scene graphs as a way to carry out 3D scene understanding, mapping objects and their relationships. In particular, we propose a learned method that regresses a scene graph from the point cloud of a scene. Our novel architecture is based on PointNet and Graph Convolutional Networks (GCN). In addition, we introduce 3DSSG, a semi-automatically generated dataset, that contains semantically rich scene graphs of 3D scenes. We show the application of our method in a domain-agnostic retrieval task, where graphs serve as an intermediate representation for 3D-3D and 2D-3D matching.
AB - Scene understanding has been of high interest in computer vision. It encompasses not only identifying objects in a scene, but also their relationships within the given context. With this goal, a recent line of works tackles 3D semantic segmentation and scene layout prediction. In our work we focus on scene graphs, a data structure that organizes the entities of a scene in a graph, where objects are nodes and their relationships modeled as edges. We leverage inference on scene graphs as a way to carry out 3D scene understanding, mapping objects and their relationships. In particular, we propose a learned method that regresses a scene graph from the point cloud of a scene. Our novel architecture is based on PointNet and Graph Convolutional Networks (GCN). In addition, we introduce 3DSSG, a semi-automatically generated dataset, that contains semantically rich scene graphs of 3D scenes. We show the application of our method in a domain-agnostic retrieval task, where graphs serve as an intermediate representation for 3D-3D and 2D-3D matching.
UR - http://www.scopus.com/inward/record.url?scp=85094318354&partnerID=8YFLogxK
U2 - 10.1109/CVPR42600.2020.00402
DO - 10.1109/CVPR42600.2020.00402
M3 - Conference article
AN - SCOPUS:85094318354
SN - 1063-6919
SP - 3960
EP - 3969
JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
M1 - 9156565
T2 - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020
Y2 - 14 June 2020 through 19 June 2020
ER -