Multi-view KPConv for enhanced 3D point cloud semantic segmentation using multi-modal fusion with 2D images

C. Du, M. A. Vega, Y. Pan, A. Borrmann

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Compared with unimodal deep learning algorithms that process 3D point clouds directly, multimodal fusion algorithms that use 2D images as supplementary information offer better performance. In this work, we improve the performance of an open-source multimodal algorithm, MVPNet, on the 3D semantic segmentation task by using KPConv as a more robust 3D backbone. Modules of the two networks are combined as follows: the 2D-3D lifting method provided by MVPNet aggregates selected 2D image features into the 3D point cloud, and KPConv then fuses these features with geometric information to make predictions. On a subset of the ScanNet dataset, the proposed network significantly outperforms both the original MVPNet and KPConv, regardless of the fusion structure. By integrating COLMAP into the workflow, we further extend the proposed method to a custom dataset. The results show that our multimodal fusion algorithm improves the identification of relevant object categories in the 3D scene.
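
The lifting step described in the abstract can be illustrated with a minimal NumPy sketch. It is not MVPNet's lifting module or the authors' implementation: it swaps in a simpler technique, projecting each 3D point into every view with a pinhole camera model and averaging the 2D features found at the resulting pixels. The function name `lift_features`, the array layouts, the single shared intrinsics matrix, and the absence of occlusion handling are all assumptions made for illustration.

```python
import numpy as np


def lift_features(points, feat_maps, intrinsics, world_to_cam):
    """Average per-pixel 2D features onto 3D points over the views that see them.

    Hypothetical helper (not part of MVPNet or KPConv).
    points:       (N, 3) world coordinates
    feat_maps:    (V, H, W, C) one 2D feature map per view
    intrinsics:   (3, 3) pinhole camera matrix, assumed shared by all views
    world_to_cam: (V, 4, 4) extrinsic matrices, one per view
    Returns (N, C) lifted features and an (N,) mask of points seen at least once.
    """
    n = points.shape[0]
    v, h, w, c = feat_maps.shape
    homog = np.hstack([points, np.ones((n, 1))])
    summed = np.zeros((n, c))
    counts = np.zeros(n)
    for i in range(v):
        # Transform the points into the camera frame of view i.
        cam = (world_to_cam[i] @ homog.T).T[:, :3]
        z = cam[:, 2]
        in_front = z > 1e-6
        safe_z = np.where(in_front, z, 1.0)  # avoid division by zero
        # Perspective projection to pixel coordinates (column u, row r).
        pix = (intrinsics @ cam.T).T
        u = np.round(pix[:, 0] / safe_z).astype(int)
        r = np.round(pix[:, 1] / safe_z).astype(int)
        # Assumption: no occlusion test, only a field-of-view check.
        visible = in_front & (u >= 0) & (u < w) & (r >= 0) & (r < h)
        summed[visible] += feat_maps[i, r[visible], u[visible]]
        counts[visible] += 1
    lifted = summed / np.maximum(counts, 1)[:, None]
    return lifted, counts > 0
```

The lifted features could then be concatenated with the point coordinates, for example `np.hstack([points, lifted])`, to form the per-point input of a 3D backbone; how the paper actually wires the two networks together is not specified in the abstract.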

Original language: English
Title of host publication: eWork and eBusiness in Architecture, Engineering and Construction - Proceedings of the 14th European Conference on Product and Process Modelling, ECPPM 2022
Editors: Eilif Hjelseth, Sujesh F. Sujan, Raimar J. Scherer
Publisher: CRC Press/Balkema
Pages: 527-534
Number of pages: 8
ISBN (Print): 9781032406732
State: Published - 2023
Event: 14th European Conference on Product and Process Modelling, ECPPM 2022 - Trondheim, Norway
Duration: 14 Sep 2022 to 16 Sep 2022

Publication series

Name: eWork and eBusiness in Architecture, Engineering and Construction - Proceedings of the 14th European Conference on Product and Process Modelling, ECPPM 2022

Conference

Conference: 14th European Conference on Product and Process Modelling, ECPPM 2022
Country/Territory: Norway
City: Trondheim
Period: 14/09/22 to 16/09/22
