Abstract
We present a novel approach that converts partial and noisy RGB-D scans into high-quality 3D scene reconstructions by inferring unobserved scene geometry. Our approach is fully self-supervised and can hence be trained solely on incomplete, real-world scans. To achieve, self-supervision, we remove frames from a given (incomplete) 3D scan in order to make it even more incomplete; self-supervision is then formulated by correlating the two levels of partialness of the same scan while masking out regions that have never been observed. Through generalization across a large training set, we can then predict 3D scene completions even without seeing any 3D scan of entirely complete geometry. Combined with a new 3D sparse generative convolutional neural network architecture, our method is able to predict highly detailed surfaces in a coarse-to-fine hierarchical fashion that outperform existing state-of-the-art methods by a significant margin in terms of reconstruction quality.
Original language | English |
---|---|
Article number | 9157644 |
Pages (from-to) | 846-855 |
Number of pages | 10 |
Journal | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
DOIs | |
State | Published - 2020 |
Event | 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020 - Virtual, Online, United States Duration: 14 Jun 2020 → 19 Jun 2020 |