Learning CPG sensory feedback with policy gradient for biped locomotion for a full-body humanoid

Gen Endo, Jun Morimoto, Takamitsu Matsubara, Jun Nakanishi, Gordon Cheng

Research output: Contribution to conference › Paper › peer-review

26 Scopus citations

Abstract

This paper describes a learning framework for a central pattern generator (CPG) based biped locomotion controller using a policy gradient method. Our goals in this study are to achieve biped walking with a 3D hardware humanoid, and to develop an efficient learning algorithm with CPG by reducing the dimensionality of the state space used for learning. We demonstrate that an appropriate feedback controller can be acquired within a thousand trials in numerical simulation, and that the controller obtained in simulation achieves stable walking with a physical robot in the real world. We evaluate walking velocity and stability in both numerical simulations and hardware experiments. Furthermore, we present the possibility of additional online learning on the hardware robot to improve the controller within 200 iterations.

Original language: English
Pages: 1267-1273
Number of pages: 7
State: Published - 2005
Externally published: Yes
Event: 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05 - Pittsburgh, PA, United States
Duration: 9 Jul 2005 to 13 Jul 2005

Conference

Conference: 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05
Country/Territory: United States
City: Pittsburgh, PA
Period: 9/07/05 to 13/07/05
