TY - JOUR
T1 - 3DCentripetalNet
T2 - Building height retrieval from monocular remote sensing imagery
AU - Li, Qingyu
AU - Mou, Lichao
AU - Hua, Yuansheng
AU - Shi, Yilei
AU - Chen, Sining
AU - Sun, Yao
AU - Zhu, Xiao Xiang
N1 - Publisher Copyright: © 2023 The Author(s)
PY - 2023/6
Y1 - 2023/6
N2 - Three-dimensional (3D) building structures are vital to understanding urban dynamics. Monocular remote sensing imagery is a cost-effective data source for large-scale building height retrieval when compared to LiDAR data and multi-view imagery. Existing methods learn building footprints and height maps per pixel via either a multi-task network or two separate networks, but they fail to consider the information of neighboring pixels that belong to the same building. Therefore, we propose learning a novel representation for 3D buildings, namely 3D centripetal shifts, a unified representation of individual building instances. Our method, termed 3DCentripetalNet, learns the 3D centripetal shift representation, which incorporates the planar and vertical structures of buildings. Afterward, a decoupling module is devised to learn building corner points. Finally, a 3D modeling module is designed to retrieve building height from the learned 3D centripetal shift map and corner points. We investigate the proposed 3DCentripetalNet on two datasets with different spatial resolutions, i.e., the ISPRS Vaihingen dataset (9 cm/pixel) and the Urban 3D dataset (50 cm/pixel). Experimental results suggest that 3DCentripetalNet is able to preserve sharp building boundaries, largely alleviate false detections, and significantly outperform other competitors. Thus, we believe that 3DCentripetalNet is a robust solution for the task of building height retrieval from monocular imagery.
KW - Building footprint generation
KW - Building height retrieval
KW - Height estimation
KW - Monocular imagery
UR - http://www.scopus.com/inward/record.url?scp=85159434760&partnerID=8YFLogxK
U2 - 10.1016/j.jag.2023.103311
DO - 10.1016/j.jag.2023.103311
M3 - Review article
AN - SCOPUS:85159434760
SN - 1569-8432
VL - 120
JO - International Journal of Applied Earth Observation and Geoinformation
JF - International Journal of Applied Earth Observation and Geoinformation
M1 - 103311
ER -