TY - JOUR
T1 - A general framework to increase safety of learning algorithms for dynamical systems based on region of attraction estimation
AU - Zhou, Zhehua
AU - Oguz, Ozgur S.
AU - Leibold, Marion
AU - Buss, Martin
N1 - Publisher Copyright: © 2020 IEEE
PY - 2020/10
Y1 - 2020/10
N2 - Although state-of-the-art learning approaches exhibit impressive results for dynamical systems, only a few applications on real physical systems have been presented. One major impediment is that the intermediate policy during the training procedure may result in behaviors that are harmful not only to the system itself but also to the environment. In essence, imposing safety guarantees on learning algorithms is vital for autonomous systems acting in the real world. In this article, we propose a computationally effective and general safe learning framework, specifically for complex dynamical systems. With a proper definition of the safe region, we give a supervisory control strategy that switches the actions applied to the system between the learning-based controller and a predefined corrective controller. A simplified system facilitates the estimation of the safe region for the high-dimensional dynamical system. During the learning phase, the belief of the safe region is updated with the actual execution results of the corrective controller, which in turn gives the learning-based controller more freedom in choosing its actions. Two examples demonstrate the performance of the proposed framework: a simple inverted pendulum to illustrate the online adaptation method, and a quadcopter control task to show the overall performance.
KW - Deep learning in robotics and automation
KW - Learning and adaptive systems
KW - Robot safety
KW - Safe reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85105852501&partnerID=8YFLogxK
U2 - 10.1109/TRO.2020.2992981
DO - 10.1109/TRO.2020.2992981
M3 - Article
AN - SCOPUS:85105852501
SN - 1552-3098
VL - 36
SP - 1472
EP - 1490
JO - IEEE Transactions on Robotics
JF - IEEE Transactions on Robotics
IS - 5
M1 - 9106864
ER -