TY - GEN
T1 - An integrated framework for multimodal human-robot interaction
AU - D'Haro, Luis Fernando
AU - Niculescu, Andreea I.
AU - Cai, Caixia
AU - Nair, Suraj
AU - Banchs, Rafael E.
AU - Knoll, Alois
AU - Li, Haizhou
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
N2 - Recent research progresses in speech recognition, text-to-speech, natural language understanding, or dialog management components are improving the way humans interact with advanced robot machines. However, far from being solved, we are just starting the process of creating meaningful multimodal platforms that can allow operators to use and control industrial robots through spoken dialogue. This paper describes our ongoing efforts on creating a modular platform that combines different technologies to cover typical requirements in an industrial setting, i.e. robust speech recognition, low level skill functions to operate the robot, recommendations and validation procedures to setup parameters, combination of audio-visual information for challenging environments, integration of domain-knowledge by means of an ontology, a flexible definition of the dialog model and natural language rules, as well as a test and control interface to quickly check the functionality of each module during development and operation. All platform modules are intercommunicated by the ROS operative system which allows the integration of external plugins and modules easily. Finally, a preliminary user study with IT experts simulating a welding task has been doing giving us clues on what should be the focus of our next developments.
AB - Recent research progresses in speech recognition, text-to-speech, natural language understanding, or dialog management components are improving the way humans interact with advanced robot machines. However, far from being solved, we are just starting the process of creating meaningful multimodal platforms that can allow operators to use and control industrial robots through spoken dialogue. This paper describes our ongoing efforts on creating a modular platform that combines different technologies to cover typical requirements in an industrial setting, i.e. robust speech recognition, low level skill functions to operate the robot, recommendations and validation procedures to setup parameters, combination of audio-visual information for challenging environments, integration of domain-knowledge by means of an ontology, a flexible definition of the dialog model and natural language rules, as well as a test and control interface to quickly check the functionality of each module during development and operation. All platform modules are intercommunicated by the ROS operative system which allows the integration of external plugins and modules easily. Finally, a preliminary user study with IT experts simulating a welding task has been doing giving us clues on what should be the focus of our next developments.
UR - http://www.scopus.com/inward/record.url?scp=85050484972&partnerID=8YFLogxK
U2 - 10.1109/APSIPA.2017.8282005
DO - 10.1109/APSIPA.2017.8282005
M3 - Conference contribution
AN - SCOPUS:85050484972
T3 - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
SP - 76
EP - 82
BT - Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017
Y2 - 12 December 2017 through 15 December 2017
ER -