TY - JOUR
T1 - Communication System Using Natural Language for Robotic Laparoscope Guidance
AU - Pan, Yan
AU - Bernhard, Lukas
AU - Fan, Cheng
AU - Beckendorf, Lukas
AU - Wilhelm, Dirk
AU - Feußner, Hubertus
AU - Groh, Georg
N1 - Publisher Copyright: © 2024 by Walter de Gruyter Berlin/Boston.
PY - 2024/9/14
Y1 - 2024/9/14
N2 - To help address the critical nurse staffing shortages in hospitals worldwide, robotic assistants are being designed to handle frequently required tasks in the digital operating room (DOR), such as guiding the laparoscopic camera. Fluent collaboration between robots and clinicians requires an intuitive and efficient communication interface that allows interaction in natural language. However, the demanding requirements of the surgical domain make it challenging to develop suitable solutions. Different words or phrases may be used to express the same command. At the same time, surgical workflows may be highly dynamic, especially in emergency situations, and thus the system must be able to grasp the user's intent both quickly and with high accuracy. Moreover, only some clinicians may be authorized to request certain tasks, depending on their rank or field of expertise. To address these challenges, our proposed communication system uses a fine-tuned deep learning model to recognize the speaker, and the robotic assistant acts only when it detects a command from the responsible clinician. In addition, our proposed conversational functions enable fine-tuned large language models to interpret the current natural language command in light of the previous command history. In this work, we present a communication system that recognizes the speaker and understands the intent of conversational commands quickly and accurately.
AB - To help address the critical nurse staffing shortages in hospitals worldwide, robotic assistants are being designed to handle frequently required tasks in the digital operating room (DOR), such as guiding the laparoscopic camera. Fluent collaboration between robots and clinicians requires an intuitive and efficient communication interface that allows interaction in natural language. However, the demanding requirements of the surgical domain make it challenging to develop suitable solutions. Different words or phrases may be used to express the same command. At the same time, surgical workflows may be highly dynamic, especially in emergency situations, and thus the system must be able to grasp the user's intent both quickly and with high accuracy. Moreover, only some clinicians may be authorized to request certain tasks, depending on their rank or field of expertise. To address these challenges, our proposed communication system uses a fine-tuned deep learning model to recognize the speaker, and the robotic assistant acts only when it detects a command from the responsible clinician. In addition, our proposed conversational functions enable fine-tuned large language models to interpret the current natural language command in light of the previous command history. In this work, we present a communication system that recognizes the speaker and understands the intent of conversational commands quickly and accurately.
KW - Communication system
KW - Large language model
KW - Natural language command
KW - Speaker recognizer
UR - http://www.scopus.com/inward/record.url?scp=85203879518&partnerID=8YFLogxK
U2 - 10.1515/cdbme-2024-1066
DO - 10.1515/cdbme-2024-1066
M3 - Article
AN - SCOPUS:85203879518
SN - 2364-5504
VL - 10
SP - 54
EP - 57
JO - Current Directions in Biomedical Engineering
JF - Current Directions in Biomedical Engineering
IS - 2
ER -