Hindsight Experience Replay with Evolutionary Decision Trees for Curriculum Goal Generation

Erdi Sayar, Vladislav Vintaykin, Giovanni Iacca, Alois Knoll

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Reinforcement learning (RL) algorithms often require a significant number of experiences to learn a policy capable of achieving desired goals in multi-goal robot manipulation tasks with sparse rewards. Hindsight Experience Replay (HER) is an existing method that improves learning efficiency by using failed trajectories and replacing the original goals with hindsight goals that are uniformly sampled from the visited states. However, HER has a limitation: the hindsight goals are mostly near the initial state, which hinders solving tasks efficiently if the desired goals are far from the initial state. To overcome this limitation, we introduce a curriculum learning method called HERDT (HER with Decision Trees). HERDT uses binary DTs to generate curriculum goals that guide a robotic agent progressively from an initial state toward a desired goal. During the warm-up stage, DTs are optimized using the Grammatical Evolution algorithm. In the training stage, curriculum goals are then sampled by DTs to help the agent navigate the environment. Since binary DTs generate discrete values, we fine-tune these curriculum points by incorporating a feedback value (i.e., the Q-value). This fine-tuning enables us to adjust the difficulty level of the generated curriculum points, ensuring that they are neither overly simplistic nor excessively challenging. In other words, these points are precisely tailored to match the robot’s ongoing learning policy. We evaluate our proposed approach on different sparse reward robotic manipulation tasks and compare it with the state-of-the-art HER approach. Our results demonstrate that our method consistently outperforms or matches the existing approach in all the tested tasks.
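
The hindsight relabeling step that the abstract attributes to HER can be illustrated with a minimal sketch. It is not the authors' implementation: the transition layout, the `sparse_reward` tolerance, the parameter `k`, and all function names are assumptions made for illustration. It shows only the baseline idea of replacing the original goal with goals sampled uniformly from states visited in the same episode; the decision-tree curriculum and the Q-value fine-tuning of HERDT are not modeled here.

```python
import random
from typing import List, Optional, Tuple

State = Tuple[float, ...]
# Assumed transition layout: (state, action, next_state, goal, reward)
Transition = Tuple[State, int, State, State, float]


def sparse_reward(achieved: State, goal: State, tol: float = 0.05) -> float:
    """Sparse reward: 0.0 if the achieved state is within tol of the goal, else -1.0."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_hindsight(
    episode: List[Transition],
    k: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Transition]:
    """Create k relabeled copies of each transition.

    Each copy swaps the original goal for a hindsight goal drawn uniformly
    from the states visited in the episode, and recomputes the reward.
    """
    rng = rng or random.Random()
    visited = [t[2] for t in episode]  # states actually reached
    relabeled: List[Transition] = []
    for state, action, next_state, _goal, _reward in episode:
        for _ in range(k):
            new_goal = rng.choice(visited)
            new_reward = sparse_reward(next_state, new_goal)
            relabeled.append((state, action, next_state, new_goal, new_reward))
    return relabeled
```

Because every hindsight goal is a state the agent actually reached, the sampled goals cluster wherever the agent has been, which early in training tends to be near the initial state. This is the limitation that the curriculum goals in HERDT are meant to address.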

Original language: English
Title of host publication: Applications of Evolutionary Computation - 27th European Conference, EvoApplications 2024, Held as Part of EvoStar 2024, Proceedings
Editors: Stephen Smith, João Correia, Christian Cintrano
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 3-18
Number of pages: 16
ISBN (Print): 9783031568541
State: Published - 2024
Event: 27th European Conference on Applications of Evolutionary Computation, EvoApplications 2024 held as part of EvoStar 2024 - Aberystwyth, United Kingdom
Duration: 3 Apr 2024 → 5 Apr 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14635 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th European Conference on Applications of Evolutionary Computation, EvoApplications 2024 held as part of EvoStar 2024
Country/Territory: United Kingdom
City: Aberystwyth
Period: 3/04/24 → 5/04/24

Keywords

  • Curriculum Learning
  • Decision Tree
  • Multi-goal Tasks
  • Reinforcement Learning
  • Sparse Reward
