TY - GEN
T1 - SegmentOR
T2 - 26th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2023
AU - Bastian, Lennart
AU - Derkacz-Bogner, Daniel
AU - Wang, Tony D.
AU - Busam, Benjamin
AU - Navab, Nassir
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.
PY - 2023
Y1 - 2023
N2 - The digitization of surgical operating rooms (ORs) has gained significant traction in the scientific and medical communities. However, existing deep-learning methods for operating room recognition tasks still require substantial quantities of annotated data. In this paper, we introduce a method for weakly-supervised semantic segmentation of surgical operating rooms. Our method operates directly on 4D point cloud sequences from multiple ceiling-mounted RGB-D sensors and requires less than 0.01% of annotated data. This is achieved by incorporating a self-supervised temporal prior, enforcing semantic consistency in 4D point cloud video recordings. We show how refining these priors with learned semantic features can increase segmentation mIoU to 10% above existing works, achieving higher segmentation scores than baselines that use four times the number of labels. Furthermore, the 3D semantic predictions from our method can be projected back into 2D images; we establish that these 2D predictions can be used to improve the performance of existing surgical phase recognition methods. Our method shows promise in automating 3D OR segmentation with a 20 times lower annotation cost than existing methods, demonstrating the potential to improve surgical scene understanding systems.
KW - Context-aware Systems
KW - Surgical Data Science
KW - Surgical Phase Recognition
KW - Surgical Scene Understanding
UR - http://www.scopus.com/inward/record.url?scp=85174686527&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-43996-4_6
DO - 10.1007/978-3-031-43996-4_6
M3 - Conference contribution
AN - SCOPUS:85174686527
SN - 9783031439957
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 57
EP - 67
BT - Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 - 26th International Conference, Proceedings
A2 - Greenspan, Hayit
A2 - Madabhushi, Anant
A2 - Mousavi, Parvin
A2 - Salcudean, Septimiu
A2 - Duncan, James
A2 - Syeda-Mahmood, Tanveer
A2 - Taylor, Russell
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 8 October 2023 through 12 October 2023
ER -