VISAGE: Video Synthesis Using Action Graphs for Surgery

Yousef Yeganeh, Rachmadio Lazuardi, Amir Shamseddin, Emine Dari, Yash Thirani, Nassir Navab, Azade Farshad

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citation

Abstract

Surgical data science (SDS) is a field that analyzes patient data before, during, and after surgery to improve surgical outcomes and skills. However, surgical data is scarce, heterogeneous, and complex, which limits the applicability of existing machine learning methods. In this work, we introduce the novel task of future video generation in laparoscopic surgery. This task can augment and enrich the existing surgical data and enable various applications, such as simulation, analysis, and robot-aided surgery. Ultimately, it involves not only understanding the current state of the operation but also accurately predicting the dynamic and often unpredictable nature of surgical procedures. Our proposed method, VISAGE (VIdeo Synthesis using Action Graphs for Surgery), leverages the power of action scene graphs to capture the sequential nature of laparoscopic procedures and utilizes diffusion models to synthesize temporally coherent video sequences. VISAGE predicts the future frames given only a single initial frame and the action graph triplets. By incorporating domain-specific knowledge through the action graph, VISAGE ensures the generated videos adhere to the expected visual and motion patterns observed in real laparoscopic procedures. The results of our experiments demonstrate high-fidelity video generation for laparoscopic procedures, which enables various applications in SDS.
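
The abstract states that generation is conditioned on a single initial frame plus action graph triplets, but does not specify how those inputs are represented. The following is only an illustrative sketch of one way such a conditioning input could be organized. The (subject, action, object) triplet layout, the example labels, and the names GenerationRequest and triplets_to_token_ids are all assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed triplet layout: (subject, action, object). The abstract does not
# define the schema, so this is a hypothetical choice.
Triplet = Tuple[str, str, str]


@dataclass
class GenerationRequest:
    """Hypothetical container for the two inputs named in the abstract."""
    initial_frame: List[List[float]]  # placeholder for one image (rows of pixels)
    triplets: List[Triplet]           # action graph triplets describing the action
    num_future_frames: int            # how many frames to predict (assumed parameter)


def triplets_to_token_ids(
    triplets: List[Triplet], vocab: Dict[str, int]
) -> List[Tuple[int, ...]]:
    """Map every triplet element to an integer id, extending vocab in place."""
    ids = []
    for triplet in triplets:
        # setdefault assigns the next free id only when the label is new.
        ids.append(tuple(vocab.setdefault(label, len(vocab)) for label in triplet))
    return ids


# Example labels are illustrative, not drawn from the paper's dataset.
request = GenerationRequest(
    initial_frame=[[0.0, 0.0], [0.0, 0.0]],
    triplets=[("grasper", "retract", "gallbladder"),
              ("hook", "dissect", "gallbladder")],
    num_future_frames=8,
)
vocab: Dict[str, int] = {}
token_ids = triplets_to_token_ids(request.triplets, vocab)
```

Integer ids of this kind are a common starting point for embedding symbolic conditions before passing them to a generative model; whether VISAGE encodes its triplets this way is not stated in the abstract.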

Original language: English
Title of host publication: Medical Image Computing and Computer Assisted Intervention – MICCAI 2024 Workshops - ISIC 2024, iMIMIC 2024, EARTH 2024, DeCaF 2024, Held in Conjunction with MICCAI 2024, Proceedings
Editors: M. Emre Celebi, Mauricio Reyes, Zhen Chen, Xiaoxiao Li
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 146-156
Number of pages: 11
ISBN (Print): 9783031776090
State: Published - 2025
Event: 9th International Skin Imaging Collaboration Workshop, ISIC 2024, 7th International Workshop on Interpretability of Machine Intelligence in Medical Image Computing, iMIMIC 2024, Embodied AI and Robotics for HealTHcare Workshop, EARTH 2024 and 5th MICCAI Workshop on Distributed, Collaborative and Federated Learning, DeCaF 2024 held at 27th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2024 - Marrakesh, Morocco
Duration: 6 Oct 2024 - 10 Oct 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15274 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th International Skin Imaging Collaboration Workshop, ISIC 2024, 7th International Workshop on Interpretability of Machine Intelligence in Medical Image Computing, iMIMIC 2024, Embodied AI and Robotics for HealTHcare Workshop, EARTH 2024 and 5th MICCAI Workshop on Distributed, Collaborative and Federated Learning, DeCaF 2024 held at 27th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2024
Country/Territory: Morocco
City: Marrakesh
Period: 6/10/24 - 10/10/24

Keywords

  • Diffusion Models
  • Surgical Data Science
  • Surgical Scene Graphs
  • Surgical Video Synthesis
