TY - GEN
T1 - KidneyDepth
T2 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
AU - Oliva-Maza, Laura
AU - Steidle, Florian
AU - Klodmann, Julian
AU - Strobl, Klaus
AU - Miernik, Arkadiusz
AU - Triebel, Rudolph
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Monocular Metric Depth Estimation (MDE) in endoscopic images is a crucial step to improve navigation during medical procedures, as it enables the estimation of dense, real-scale 3D maps of the organs. For instance, in monocular flexible ureteroscopy (fURS), accurate navigation and real-scale information are essential for locating and removing kidney stones efficiently. Currently, the most promising approach to infer depth from single passive cameras is supervised training of large neural networks, so-called foundation models for MDE. However, the depth output of these models is biased when the training data domain does not match the target domain (both camera and scene). At the same time, one of the greatest challenges in medical imaging is the lack of annotated datasets, as obtaining real ground truth (e.g., depth data) is difficult. To overcome this, simulation has become a valuable tool in ureteroscopic imaging research. In this study, we introduce KidneyDepth, a synthetic dataset designed to reduce the gap between simulated and real-world 3D imaging. It includes a variety of shapes (e.g., meshes from CT scans, geometric primitives) along with different textures and lighting conditions, generated with BlenderProc2 [7]. To assess the effectiveness of KidneyDepth, we fine-tune two state-of-the-art MDE models (Depth Anything V2 and ZoeDepth) and test their performance on both simulated and real ureteroscopic images. Additionally, we evaluate the validity of their output by using the inferred depths in the context of an RGB-D SLAM system. Our results show that training models on a synthetic dataset with diverse structures and lighting conditions improves depth estimation in real endoscopic images, and our simulations show that these RGB-D images enhance overall SLAM accuracy. The KidneyDepth dataset can be found at https://zenodo.org/records/14893421.
AB - Monocular Metric Depth Estimation (MDE) in endoscopic images is a crucial step to improve navigation during medical procedures, as it enables the estimation of dense, real-scale 3D maps of the organs. For instance, in monocular flexible ureteroscopy (fURS), accurate navigation and real-scale information are essential for locating and removing kidney stones efficiently. Currently, the most promising approach to infer depth from single passive cameras is supervised training of large neural networks, so-called foundation models for MDE. However, the depth output of these models is biased when the training data domain does not match the target domain (both camera and scene). At the same time, one of the greatest challenges in medical imaging is the lack of annotated datasets, as obtaining real ground truth (e.g., depth data) is difficult. To overcome this, simulation has become a valuable tool in ureteroscopic imaging research. In this study, we introduce KidneyDepth, a synthetic dataset designed to reduce the gap between simulated and real-world 3D imaging. It includes a variety of shapes (e.g., meshes from CT scans, geometric primitives) along with different textures and lighting conditions, generated with BlenderProc2 [7]. To assess the effectiveness of KidneyDepth, we fine-tune two state-of-the-art MDE models (Depth Anything V2 and ZoeDepth) and test their performance on both simulated and real ureteroscopic images. Additionally, we evaluate the validity of their output by using the inferred depths in the context of an RGB-D SLAM system. Our results show that training models on a synthetic dataset with diverse structures and lighting conditions improves depth estimation in real endoscopic images, and our simulations show that these RGB-D images enhance overall SLAM accuracy. The KidneyDepth dataset can be found at https://zenodo.org/records/14893421.
KW - Dataset
KW - Monocular metric depth
KW - Navigation
KW - Ureteroscopy
UR - https://www.scopus.com/pages/publications/105017964348
U2 - 10.1007/978-3-032-05114-1_32
DO - 10.1007/978-3-032-05114-1_32
M3 - Conference contribution
AN - SCOPUS:105017964348
SN - 9783032051134
T3 - Lecture Notes in Computer Science
SP - 331
EP - 340
BT - Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, 2025, Proceedings
A2 - Gee, James C.
A2 - Hong, Jaesung
A2 - Sudre, Carole H.
A2 - Golland, Polina
A2 - Park, Jinah
A2 - Alexander, Daniel C.
A2 - Iglesias, Juan Eugenio
A2 - Venkataraman, Archana
A2 - Kim, Jong Hyo
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 23 September 2025 through 27 September 2025
ER -