ScanNet: Richly-annotated 3D reconstructions of indoor scenes

Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1848 Scopus citations

Abstract

A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very little data is available - current datasets cover a small range of scene views and have limited semantic annotations. To address this issue, we introduce ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture system that includes automated surface reconstruction and crowd-sourced semantic annotation. We show that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks, including 3D object classification, semantic voxel labeling, and CAD model retrieval.

Original languageEnglish
Title of host publicationProceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages2432-2443
Number of pages12
ISBN (Electronic)9781538604571
DOIs
StatePublished - 6 Nov 2017
Event30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017 - Honolulu, United States
Duration: 21 Jul 201726 Jul 2017

Publication series

NameProceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
Volume2017-January

Conference

Conference30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
Country/TerritoryUnited States
CityHonolulu
Period21/07/1726/07/17

Fingerprint

Dive into the research topics of 'ScanNet: Richly-annotated 3D reconstructions of indoor scenes'. Together they form a unique fingerprint.

Cite this