TY - GEN
T1 - ImitationNet
T2 - 22nd IEEE-RAS International Conference on Humanoid Robots, Humanoids 2023
AU - Yan, Yashuai
AU - Mascaro, Esteve Valls
AU - Lee, Dongheui
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - This paper introduces a novel deep-learning approach for human-to-robot motion retargeting, enabling robots to mimic human poses accurately. Contrary to prior deep-learning-based works, our method does not require paired human-to-robot data, which facilitates its translation to new robots. First, we construct a shared latent space between humans and robots via adaptive contrastive learning that takes advantage of a proposed cross-domain similarity metric between human and robot poses. Additionally, we propose a consistency term to build a common latent space that captures the similarity of the poses with precision while allowing direct robot motion control from the latent space. For instance, we can generate in-between motion through simple linear interpolation between two projected human poses. We conduct a comprehensive evaluation of robot control from diverse modalities (i.e., texts, RGB videos, and key poses), which facilitates robot control for non-expert users. Our model outperforms existing works on human-to-robot retargeting in terms of efficiency and precision. Finally, we implemented our method on a real robot with self-collision avoidance through a whole-body controller to showcase the effectiveness of our approach.
AB - This paper introduces a novel deep-learning approach for human-to-robot motion retargeting, enabling robots to mimic human poses accurately. Contrary to prior deep-learning-based works, our method does not require paired human-to-robot data, which facilitates its translation to new robots. First, we construct a shared latent space between humans and robots via adaptive contrastive learning that takes advantage of a proposed cross-domain similarity metric between human and robot poses. Additionally, we propose a consistency term to build a common latent space that captures the similarity of the poses with precision while allowing direct robot motion control from the latent space. For instance, we can generate in-between motion through simple linear interpolation between two projected human poses. We conduct a comprehensive evaluation of robot control from diverse modalities (i.e., texts, RGB videos, and key poses), which facilitates robot control for non-expert users. Our model outperforms existing works on human-to-robot retargeting in terms of efficiency and precision. Finally, we implemented our method on a real robot with self-collision avoidance through a whole-body controller to showcase the effectiveness of our approach.
UR - https://www.scopus.com/pages/publications/85182943604
U2 - 10.1109/Humanoids57100.2023.10375150
DO - 10.1109/Humanoids57100.2023.10375150
M3 - Conference contribution
AN - SCOPUS:85182943604
T3 - IEEE-RAS International Conference on Humanoid Robots
BT - 2023 IEEE-RAS 22nd International Conference on Humanoid Robots, Humanoids 2023
PB - IEEE Computer Society
Y2 - 12 December 2023 through 14 December 2023
ER -