TY - GEN
T1 - A robust stability approach to robot reinforcement learning based on a parameterization of stabilizing controllers
AU - Friedrich, Stefan R.
AU - Buss, Martin
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/21
Y1 - 2017/7/21
N2 - Reinforcement learning has become increasingly popular in robotics for acquiring feedback controllers. Many approaches aim to learn a controller from scratch, i.e., in a data-driven manner without any modeling of the physical plant. However, stability properties of the closed loop are often not considered, or are established only a posteriori or ad hoc. We propose to employ reinforcement learning in the context of model-based control, so that learning takes place within a framework of stabilizing controllers built using only little prior model knowledge. This way, the action space is suitably structured for safe learning of a feedback controller that compensates for uncertainties due to model mismatch or external disturbances. The resulting scheme is developed around a decentralized PD feedback controller. Therefore, given such a controller, the proposed method can also be used to add a learning module for performance enhancement. We demonstrate our approach both in simulation and in a hardware experiment using a two-degree-of-freedom robot manipulator.
AB - Reinforcement learning has become increasingly popular in robotics for acquiring feedback controllers. Many approaches aim to learn a controller from scratch, i.e., in a data-driven manner without any modeling of the physical plant. However, stability properties of the closed loop are often not considered, or are established only a posteriori or ad hoc. We propose to employ reinforcement learning in the context of model-based control, so that learning takes place within a framework of stabilizing controllers built using only little prior model knowledge. This way, the action space is suitably structured for safe learning of a feedback controller that compensates for uncertainties due to model mismatch or external disturbances. The resulting scheme is developed around a decentralized PD feedback controller. Therefore, given such a controller, the proposed method can also be used to add a learning module for performance enhancement. We demonstrate our approach both in simulation and in a hardware experiment using a two-degree-of-freedom robot manipulator.
UR - http://www.scopus.com/inward/record.url?scp=85028007188&partnerID=8YFLogxK
U2 - 10.1109/ICRA.2017.7989382
DO - 10.1109/ICRA.2017.7989382
M3 - Conference contribution
AN - SCOPUS:85028007188
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 3365
EP - 3372
BT - ICRA 2017 - IEEE International Conference on Robotics and Automation
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE International Conference on Robotics and Automation, ICRA 2017
Y2 - 29 May 2017 through 3 June 2017
ER -