Visually Grounding Language Instruction for History-Dependent Manipulation

Hyemin Ahn, Obin Kwon, Kyungdo Kim, Jaeyeon Jeong, Howoong Jun, Hongjung Lee, Dongheui Lee, Songhwai Oh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

This paper emphasizes the importance of a robot's ability to refer to its task history, especially when it executes a series of pick-and-place manipulations by following language instructions given one by one. Referring to the manipulation history has two advantages: (1) language instructions that omit details but use expressions referring to the past can be interpreted, and (2) the visual information of objects occluded by previous manipulations can be inferred. To this end, we introduce a history-dependent manipulation task whose objective is to visually ground a series of language instructions for proper pick-and-place manipulations by referring to the past. We also propose a relevant dataset and a model that can serve as a baseline, and show that our model trained with the proposed dataset can also be applied to the real world by using CycleGAN. Our dataset and code are publicly available on the project website: https://sites.google.com/view/history-dependent-manipulation.

Original language: English
Title of host publication: 2022 IEEE International Conference on Robotics and Automation, ICRA 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 675-682
Number of pages: 8
ISBN (Electronic): 9781728196817
State: Published - 2022
Externally published: Yes
Event: 39th IEEE International Conference on Robotics and Automation, ICRA 2022 - Philadelphia, United States
Duration: 23 May 2022 → 27 May 2022

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
ISSN (Print): 1050-4729

Conference

Conference: 39th IEEE International Conference on Robotics and Automation, ICRA 2022
Country/Territory: United States
City: Philadelphia
Period: 23/05/22 → 27/05/22
