TY - JOUR
T1 - Learning a Low-Dimensional Representation of a Safe Region for Safe Reinforcement Learning on Dynamical Systems
AU - Zhou, Zhehua
AU - Oguz, Ozgur S.
AU - Leibold, Marion
AU - Buss, Martin
N1 - Publisher Copyright: © 2012 IEEE.
PY - 2023/5/1
Y1 - 2023/5/1
AB - For the safe application of reinforcement learning algorithms to high-dimensional nonlinear dynamical systems, a simplified system model is used to formulate a safe reinforcement learning (SRL) framework. Based on the simplified system model, a low-dimensional representation of the safe region is identified and used to provide safety estimates for learning algorithms. However, finding a satisfactory simplified system model for complex dynamical systems usually requires a considerable amount of effort. To overcome this limitation, we propose a general data-driven approach that can efficiently learn a low-dimensional representation of the safe region. By employing an online adaptation method, the low-dimensional representation is updated using the feedback data to obtain more accurate safety estimates. The performance of the proposed approach for identifying the low-dimensional representation of the safe region is illustrated using the example of a quadcopter. The results demonstrate a more reliable and representative low-dimensional representation of the safe region compared with previous works, which extends the applicability of the SRL framework.
KW - Data-driven model order reduction
KW - deep learning in robotics and automation
KW - learning and adaptive systems
KW - safe reinforcement learning (SRL)
UR - http://www.scopus.com/inward/record.url?scp=85114733612&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2021.3106818
DO - 10.1109/TNNLS.2021.3106818
M3 - Article
AN - SCOPUS:85114733612
SN - 2162-237X
VL - 34
SP - 2513
EP - 2527
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 5
ER -