Hand Pose-based Task Learning from Visual Observations with Semantic Skill Extraction

Zeju Qiu, Thomas Eiband, Shile Li, Dongheui Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Learning from Demonstration is a promising technique for transferring task knowledge from a user to a robot. We propose a framework for task programming that observes the human hand pose and object locations solely with a depth camera. By extracting skills from the demonstrations, we are able to represent what the robot has learned, generalize to unseen object locations, and optimize the robotic execution instead of replaying a non-optimal behavior. A two-stage segmentation algorithm that employs skill template matching via Hidden Markov Models extracts motion primitives from the demonstration and assigns them semantic meaning. In this way, the transfer of task knowledge is improved from a simple replay of the demonstration towards a semantically annotated, optimized, and generalized execution. We evaluate the extraction of a set of skills in simulation and show that the task execution can be optimized by these means.
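The abstract describes skill template matching via Hidden Markov Models for segmenting demonstrations into motion primitives. The sketch below is not the authors' implementation; it is a minimal illustration of the general idea using hmmlearn, assuming one Gaussian HMM is fit per skill and candidate windows of a new demonstration are labelled with the skill whose model scores them highest. The skill names and the 6-D feature layout are illustrative assumptions.

```python
# Hypothetical sketch of HMM-based skill template matching (not the paper's code).
import numpy as np
from hmmlearn.hmm import GaussianHMM

def fit_skill_models(skill_segments, n_states=3, seed=0):
    """skill_segments: dict mapping skill name -> list of (T_i, D) feature arrays."""
    models = {}
    for name, segments in skill_segments.items():
        X = np.vstack(segments)               # concatenate all example segments
        lengths = [len(s) for s in segments]  # per-example sequence lengths
        m = GaussianHMM(n_components=n_states, covariance_type="diag",
                        n_iter=50, random_state=seed)
        m.fit(X, lengths)
        models[name] = m
    return models

def label_window(models, window):
    """Return the skill whose HMM assigns the window the highest log-likelihood."""
    scores = {name: m.score(window) for name, m in models.items()}
    return max(scores, key=scores.get)

# Usage with synthetic 6-D features (e.g. hand pose plus object offset, assumed):
rng = np.random.default_rng(0)
demos = {
    "reach": [rng.normal(0.0, 1.0, size=(40, 6)) for _ in range(5)],
    "grasp": [rng.normal(3.0, 1.0, size=(40, 6)) for _ in range(5)],
}
models = fit_skill_models(demos)
print(label_window(models, rng.normal(3.0, 1.0, size=(40, 6))))  # likely "grasp"
```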

Original language: English
Title of host publication: 29th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 596-603
Number of pages: 8
ISBN (Electronic): 9781728160757
DOIs
State: Published - Aug 2020
Externally published: Yes
Event: 29th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2020 - Virtual, Naples, Italy
Duration: 31 Aug 2020 - 4 Sep 2020

Publication series

Name: 29th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2020

Conference

Conference: 29th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2020
Country/Territory: Italy
City: Virtual, Naples
Period: 31/08/20 - 04/09/20
