TY - GEN
T1 - Cloud-Edge Training Architecture for Sim-to-Real Deep Reinforcement Learning
AU - Cao, Hongpeng
AU - Theile, Mirco
AU - Wyrwal, Federico G.
AU - Caccamo, Marco
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Deep reinforcement learning (DRL) is a promising approach to solve complex control tasks by learning policies through interactions with the environment. However, the training of DRL policies requires large amounts of training experiences, making it impractical to learn the policy directly on physical systems. Sim-to-real approaches leverage simulations to pretrain DRL policies and then deploy them in the real world. Unfortunately, the direct real-world deployment of pretrained policies usually suffers from performance deterioration due to the different dynamics, known as the reality gap. Recent sim-to-real methods, such as domain randomization and domain adaptation, focus on improving the robustness of the pretrained agents. Nevertheless, the simulation-trained policies often need to be tuned with real-world data to reach optimal performance, which is challenging due to the high cost of real-world samples. This work proposes a distributed cloud-edge architecture to train DRL agents in the real world in real-time. In the architecture, the inference and training are assigned to the edge and cloud, separating the real-time control loop from the computationally expensive training loop. To overcome the reality gap, our architecture exploits sim-to-real transfer strategies to continue the training of simulation-pretrained agents on a physical system. We demonstrate its applicability on a physical inverted-pendulum control system, analyzing critical parameters. The real-world experiments show that our architecture can adapt the pretrained DRL agents to unseen dynamics consistently and efficiently. A video showing a real-world training process under the proposed method can be found at https://youtu.be/hMY9-c0SST0.
UR - http://www.scopus.com/inward/record.url?scp=85146358692&partnerID=8YFLogxK
U2 - 10.1109/IROS47612.2022.9981565
DO - 10.1109/IROS47612.2022.9981565
M3 - Conference contribution
AN - SCOPUS:85146358692
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 9363
EP - 9370
BT - IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2022
Y2 - 23 October 2022 through 27 October 2022
ER -