
Long-Horizon Language-Conditioned Imitation Learning for Robotic Manipulation

  • Xiangtong Yao
  • Tobias Blei
  • Yuan Meng
  • Yu Zhang
  • Hongkuan Zhou
  • Zhenshan Bing
  • Kai Huang
  • Fuchun Sun
  • Alois Knoll
  • Technical University of Munich
  • Robert Bosch GmbH
  • Nanjing University
  • Sun Yat-Sen University
  • Tsinghua University

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Language-controlled policies enable robots to follow human language instructions and execute complex tasks. While language-conditioned imitation learning has proven effective in teaching robots to perform tasks guided by language instructions, it faces multiple challenges due to the multimodal nature of human demonstrations and limited training data. The variability in demonstrations can complicate policy learning, as the same instruction may correspond to diverse actions. To mitigate these issues, we propose an end-to-end transformer-based policy, predicting categorical distributions over a discretized action space. By discretizing the action space and employing autoregressive sampling, our model efficiently handles the exponential growth of high-dimensional discrete action spaces, allowing it to learn complex action distributions effectively. In addition, we apply data augmentation techniques to reuse existing data more effectively and implement an action disturbance strategy to enhance the model's generalization capabilities. Furthermore, we employ a cotraining strategy to leverage data that lacks language annotations. The effectiveness of our approach is demonstrated through simulation and real-world experiments on a robot manipulator in a long-horizon, language-conditioned setting, including multiple environments and zero-shot transferring to real-world settings.
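
The discretization and autoregressive sampling described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration only and is not taken from the paper: the bin count (`NUM_BINS`), the action dimensionality (`ACTION_DIM`), the normalized range, uniform binning, and the `toy_logits` function, which is a hypothetical stand-in for the transformer's categorical output head.

```python
import math
import random

NUM_BINS = 256         # assumed number of bins per action dimension
ACTION_DIM = 7         # assumed, e.g. a 6-DoF end-effector delta plus gripper
LOW, HIGH = -1.0, 1.0  # assumed normalized action range


def discretize(action):
    """Map each continuous value in [LOW, HIGH] to a uniform bin index."""
    bins = []
    for a in action:
        a = min(max(a, LOW), HIGH)  # clip to the valid range
        idx = int((a - LOW) / (HIGH - LOW) * NUM_BINS)
        bins.append(min(idx, NUM_BINS - 1))  # HIGH falls into the last bin
    return bins


def undiscretize(bins):
    """Map bin indices back to continuous values at the bin centers."""
    width = (HIGH - LOW) / NUM_BINS
    return [LOW + (b + 0.5) * width for b in bins]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_logits(prefix):
    """Hypothetical stand-in for a policy head: returns NUM_BINS logits
    for the next action dimension, conditioned on the bins chosen so far."""
    rng = random.Random(1000 * len(prefix) + sum(prefix))
    return [rng.gauss(0.0, 1.0) for _ in range(NUM_BINS)]


def sample_action(logits_fn, rng):
    """Sample one action dimension at a time, each conditioned on the
    previously sampled dimensions."""
    prefix = []
    for _ in range(ACTION_DIM):
        probs = softmax(logits_fn(prefix))
        token = rng.choices(range(NUM_BINS), weights=probs, k=1)[0]
        prefix.append(token)
    return prefix


if __name__ == "__main__":
    tokens = sample_action(toy_logits, random.Random(0))
    print("sampled bins:", tokens)
    print("continuous action:", undiscretize(tokens))
    # A single joint categorical would need NUM_BINS ** ACTION_DIM outputs;
    # the factorized form needs NUM_BINS outputs per step, ACTION_DIM steps.
    print("joint size:", NUM_BINS ** ACTION_DIM)
    print("autoregressive outputs:", NUM_BINS * ACTION_DIM)
```

Under these assumed settings, factorizing the distribution one dimension at a time replaces a joint categorical over `NUM_BINS ** ACTION_DIM` combinations with `ACTION_DIM` categoricals of `NUM_BINS` entries each, while conditioning on the prefix still lets the policy represent multimodal action distributions.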

Original language: English
Pages (from-to): 5628-5639
Number of pages: 12
Journal: IEEE/ASME Transactions on Mechatronics
Volume: 30
Issue number: 6
State: Published - 2025

Keywords

  • Imitation learning
  • language-controlled robotics
  • long-horizon task learning
