Joint prediction of monocular depth and structure using planar and parallax geometry

Hao Xing, Yifan Cao, Maximilian Biber, Mingchuan Zhou, Darius Burschka

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


Supervised depth estimation methods can achieve good performance when trained on high-quality ground truth such as LiDAR data. However, LiDAR can only generate sparse 3D maps, which causes a loss of information, and high-quality per-pixel ground-truth depth is difficult to acquire. To overcome this limitation, we propose a novel approach that combines structure information from a promising Plane and Parallax geometry pipeline with depth information in a U-Net supervised learning network, which yields quantitative and qualitative improvements over existing popular learning-based methods. In particular, the model is evaluated on two large-scale and challenging datasets, the KITTI Vision Benchmark and Cityscapes, and achieves the best performance in terms of relative error. Compared with pure depth supervision models, our model performs notably well on depth prediction for thin objects and edges, and compared with the structure prediction baseline, it is more robust.
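For readers unfamiliar with the relative-error metric mentioned in the abstract, the sketch below shows one common form, the mean absolute relative error, computed only over pixels that have ground truth. It is an illustration and not the authors' evaluation code: the function name `abs_rel_error`, the flat-list inputs, and the convention that a ground-truth value of 0 or less marks a missing LiDAR return are all assumptions made here.

```python
from typing import Optional, Sequence


def abs_rel_error(pred: Sequence[float], gt: Sequence[float]) -> Optional[float]:
    """Mean absolute relative error over pixels with valid ground truth.

    Pixels whose ground-truth depth is <= 0 are treated as missing
    (an assumed convention for sparse LiDAR maps) and are skipped.
    Returns None when no pixel has valid ground truth.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    total = 0.0
    count = 0
    for p, g in zip(pred, gt):
        if g > 0:
            # Error is normalised by the true depth, so a 2 m miss at 20 m
            # counts the same as a 1 m miss at 10 m.
            total += abs(p - g) / g
            count += 1
    return total / count if count else None


if __name__ == "__main__":
    # Third pixel has no ground truth and is ignored.
    print(abs_rel_error([10.0, 22.0, 5.0], [10.0, 20.0, 0.0]))
```

Because the average runs only over valid pixels, a sparse ground-truth map leaves most of the image unscored, which is one way to read the abstract's point about sparse LiDAR data losing information.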

Original language: English
Article number: 108806
Journal: Pattern Recognition
State: Published - Oct 2022


  • Joint prediction model
  • Monocular depth estimation
  • Plane and parallax geometry
  • Structure information

