TY - JOUR
T1 - Curriculum Learning for Robot Manipulation Tasks With Sparse Reward Through Environment Shifts
AU - Sayar, Erdi
AU - Iacca, Giovanni
AU - Knoll, Alois
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Multi-goal reinforcement learning (RL) with sparse rewards poses a significant challenge for RL methods. Hindsight experience replay (HER) addresses this challenge by learning from failures and replacing the desired goals with achieved states. However, HER often becomes inefficient when the desired goals are far from the initial states. This paper introduces co-adapting hindsight experience replay with environment shifts (in short, COHER). COHER generates progressively more complex tasks as soon as the agent's success rate surpasses a predefined threshold. The generated tasks and the agent are coupled so as to optimize the agent's behavior within each task-agent pair. We evaluate COHER on various sparse-reward robotic tasks that require obstacle avoidance capabilities and compare it with hindsight goal generation (HGG), curriculum-guided hindsight experience replay (CHER), and vanilla HER. The results show that COHER consistently outperforms the other methods and that the obtained policies can avoid obstacles without explicit information about their positions. Lastly, we deploy such policies on a real Franka robot for Sim2Real analysis. We observe that the robot can complete the task while avoiding obstacles, whereas policies obtained with the other methods cannot. The videos and code are publicly available at: https://erdiphd.github.io/COHER/.
KW - Curriculum learning-based reinforcement learning
KW - hindsight experience replay
KW - multi-goal reinforcement learning
KW - robotic control
UR - http://www.scopus.com/inward/record.url?scp=85189135636&partnerID=8YFLogxK
DO - 10.1109/ACCESS.2024.3382264
M3 - Article
AN - SCOPUS:85189135636
SN - 2169-3536
VL - 12
SP - 46626
EP - 46635
JO - IEEE Access
JF - IEEE Access
ER -