TY - GEN
T1 - Dynamic Scene Graph Representation for Surgical Video
AU - Holm, Felix
AU - Ghazaei, Ghazal
AU - Czempiel, Tobias
AU - Özsoy, Ege
AU - Saur, Stefan
AU - Navab, Nassir
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Surgical videos captured from microscopic or endoscopic imaging devices are rich but complex sources of information, depicting different tools and anatomical structures utilized during an extended amount of time. Despite containing crucial workflow information and being commonly recorded in many procedures, usage of surgical videos for automated surgical workflow understanding is still limited. In this work, we exploit scene graphs as a more holistic, semantically meaningful and human-readable way to represent surgical videos while encoding all anatomical structures, tools, and their interactions. To properly evaluate the impact of our solutions, we create a scene graph dataset from semantic segmentations from the CaDIS and CATARACTS datasets. We demonstrate that scene graphs can be leveraged through the use of graph convolutional networks (GCNs) to tackle surgical downstream tasks such as surgical workflow recognition with competitive performance. Moreover, we demonstrate the benefits of surgical scene graphs regarding the explainability and robustness of model decisions, which are crucial in the clinical setting.
AB - Surgical videos captured from microscopic or endoscopic imaging devices are rich but complex sources of information, depicting different tools and anatomical structures utilized during an extended amount of time. Despite containing crucial workflow information and being commonly recorded in many procedures, usage of surgical videos for automated surgical workflow understanding is still limited. In this work, we exploit scene graphs as a more holistic, semantically meaningful and human-readable way to represent surgical videos while encoding all anatomical structures, tools, and their interactions. To properly evaluate the impact of our solutions, we create a scene graph dataset from semantic segmentations from the CaDIS and CATARACTS datasets. We demonstrate that scene graphs can be leveraged through the use of graph convolutional networks (GCNs) to tackle surgical downstream tasks such as surgical workflow recognition with competitive performance. Moreover, we demonstrate the benefits of surgical scene graphs regarding the explainability and robustness of model decisions, which are crucial in the clinical setting.
KW - Cataract Surgery
KW - Dynamic Scene Graph
KW - Surgical Video
UR - http://www.scopus.com/inward/record.url?scp=85182945563&partnerID=8YFLogxK
U2 - 10.1109/ICCVW60793.2023.00015
DO - 10.1109/ICCVW60793.2023.00015
M3 - Conference contribution
AN - SCOPUS:85182945563
T3 - Proceedings - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023
SP - 81
EP - 87
BT - Proceedings - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023
Y2 - 2 October 2023 through 6 October 2023
ER -