TY - GEN
T1 - Semantic Scene Graph for Ultrasound Image Explanation and Scanning Guidance
AU - Li, Xuesong
AU - Huang, Dianye
AU - Zhang, Yameng
AU - Navab, Nassir
AU - Jiang, Zhongliang
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Understanding medical ultrasound imaging remains a long-standing challenge due to significant visual variability caused by differences in imaging and acquisition parameters. Recent advancements in large language models (LLMs) have been used to automatically generate terminology-rich summaries orientated to clinicians with sufficient physiological knowledge. Nevertheless, the increasing demand for improved ultrasound interpretability and basic scanning guidance among non-expert users, e.g., in point-of-care settings, has not yet been explored. In this study, we first introduce the scene graph (SG) for ultrasound images to explain image content to non-expert users and provide guidance for ultrasound scanning. The ultrasound SG is first computed using a transformer-based one-stage method, eliminating the need for explicit object detection. To generate a graspable image explanation for non-expert users, the user query is then used to further refine the abstract SG representation through LLMs. Additionally, the predicted SG is explored for its potential in guiding ultrasound scanning toward missing anatomies within the current imaging view, assisting ordinary users in achieving more standardized and complete anatomical exploration. The effectiveness of this SG-based image explanation and scanning guidance has been validated on images from the left and right neck regions, including the carotid and thyroid, across five volunteers. The results demonstrate the potential of the method to maximally democratize ultrasound by enhancing its interpretability and usability for non-expert users. Project page: https://noseefood.github.io/us-scene-graph/.
AB - Understanding medical ultrasound imaging remains a long-standing challenge due to significant visual variability caused by differences in imaging and acquisition parameters. Recent advancements in large language models (LLMs) have been used to automatically generate terminology-rich summaries orientated to clinicians with sufficient physiological knowledge. Nevertheless, the increasing demand for improved ultrasound interpretability and basic scanning guidance among non-expert users, e.g., in point-of-care settings, has not yet been explored. In this study, we first introduce the scene graph (SG) for ultrasound images to explain image content to non-expert users and provide guidance for ultrasound scanning. The ultrasound SG is first computed using a transformer-based one-stage method, eliminating the need for explicit object detection. To generate a graspable image explanation for non-expert users, the user query is then used to further refine the abstract SG representation through LLMs. Additionally, the predicted SG is explored for its potential in guiding ultrasound scanning toward missing anatomies within the current imaging view, assisting ordinary users in achieving more standardized and complete anatomical exploration. The effectiveness of this SG-based image explanation and scanning guidance has been validated on images from the left and right neck regions, including the carotid and thyroid, across five volunteers. The results demonstrate the potential of the method to maximally democratize ultrasound by enhancing its interpretability and usability for non-expert users. Project page: https://noseefood.github.io/us-scene-graph/.
KW - Point-of-Care Ultrasound
KW - Scene Graph
KW - Ultrasound Image Analysis
UR - https://www.scopus.com/pages/publications/105017958855
U2 - 10.1007/978-3-032-05114-1_48
DO - 10.1007/978-3-032-05114-1_48
M3 - Conference contribution
AN - SCOPUS:105017958855
SN - 9783032051134
T3 - Lecture Notes in Computer Science
SP - 500
EP - 510
BT - Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, 2025, Proceedings
A2 - Gee, James C.
A2 - Hong, Jaesung
A2 - Sudre, Carole H.
A2 - Golland, Polina
A2 - Park, Jinah
A2 - Alexander, Daniel C.
A2 - Iglesias, Juan Eugenio
A2 - Venkataraman, Archana
A2 - Kim, Jong Hyo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
Y2 - 23 September 2025 through 27 September 2025
ER -