Abstract
This paper describes a learning framework for a central pattern generator (CPG)-based biped locomotion controller using a policy gradient method. Our goals in this study are to achieve biped walking with a 3D hardware humanoid and to develop an efficient learning algorithm for the CPG by reducing the dimensionality of the state space used for learning. We demonstrate in numerical simulations that an appropriate feedback controller can be acquired within a thousand trials, and that the controller obtained in simulation achieves stable walking with a physical robot in the real world. Walking velocity and stability were evaluated in both numerical simulations and hardware experiments. Furthermore, we present the possibility of additional online learning using a hardware robot to improve the controller within 200 iterations.
| Original language | English |
|---|---|
| Pages | 1267-1273 |
| Number of pages | 7 |
| Publication status | Published - 2005 |
| Externally published | Yes |
| Event | 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05 - Pittsburgh, PA, United States. Duration: 9 July 2005 → 13 July 2005 |
Conference
| Conference | 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05 |
|---|---|
| Country/Territory | United States |
| City | Pittsburgh, PA |
| Period | 9/07/05 → 13/07/05 |