TY - GEN
T1 - SemanticPaint
T2 - International Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 2015
AU - Valentin, Julien
AU - Kohli, Pushmeet
AU - Vineet, Vibhav
AU - Nießner, Matthias
AU - Cheng, Ming-Ming
AU - Criminisi, Antonio
AU - Kim, David
AU - Izadi, Shahram
AU - Shotton, Jamie
AU - Torr, Philip
PY - 2015/7/31
Y1 - 2015/7/31
AB - We present a real-time, interactive system for the geometric reconstruction, object-class segmentation and learning of 3D scenes [Valentin et al.]. Using our system, a user can walk into a room wearing a consumer depth camera and a virtual reality headset, and both densely reconstruct the 3D scene [Nießner et al. 2013] and interactively segment the environment into object classes such as 'chair', 'floor' and 'table'. The user interacts physically with the real-world scene, touching or pointing at objects and using voice commands to assign them appropriate labels. These user-generated labels are leveraged by a new online random-forest-based machine learning algorithm, which is used to predict labels for previously unseen parts of the scene. The predicted labels, together with those provided directly by the user, are incorporated into a dense 3D conditional random field model, over which we perform mean-field inference to filter out label inconsistencies. The entire pipeline runs in real time, and the user stays 'in the loop' throughout the process, receiving immediate feedback about the progress of the labelling and interacting with the scene as necessary to refine the predicted segmentation.
UR - http://www.scopus.com/inward/record.url?scp=84957940038&partnerID=8YFLogxK
U2 - 10.1145/2775280.2792589
DO - 10.1145/2775280.2792589
M3 - Conference contribution
AN - SCOPUS:84957940038
T3 - ACM SIGGRAPH 2015 Talks, SIGGRAPH 2015
BT - ACM SIGGRAPH 2015 Talks, SIGGRAPH 2015
PB - Association for Computing Machinery, Inc
Y2 - 9 August 2015 through 13 August 2015
ER -