Continuous Quantum Reinforcement Learning for Robot Navigation

Theodora Augustina Drăgan, Alexander Künzner, Robert Wille, Jeanette Miriam Lorenz

Research output: Contribution to journal › Conference article › peer-review

Abstract

One of the multiple facets of quantum reinforcement learning (QRL) is enhancing reinforcement learning (RL) algorithms with quantum submodules, namely with variational quantum circuits (VQC) as function approximators. QRL solutions have been shown empirically to require fewer training iterations or adjustable parameters than their classical counterparts, but are usually restricted to applications that have a discrete action space and thus limited industrial relevance. We propose a hybrid quantum-classical (HQC) deep deterministic policy gradient (DDPG) approach for a robot to navigate through a maze using continuous states, continuous actions and local observations from the robot’s LiDAR sensors. We show that this HQC method can yield models whose test results are comparable to those of the neural network (NN)-based DDPG algorithm while needing around 200 times fewer weights. We also study the scalability of our solution with respect to the number of VQC layers and qubits, and find that in general results improve as the layer and qubit counts increase. The best rewards among all similarly sized HQC and classical DDPG methods correspond to a VQC of 8 qubits and 5 layers with no other NN. This work is another step towards continuous QRL, where literature has been sparse.
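To make the idea of a VQC acting as a continuous-action function approximator concrete, here is a minimal NumPy state-vector sketch. It is not the circuit used in the paper, because the abstract does not specify the ansatz. The angle encoding, the RY-plus-CNOT-ring layers, the Pauli-Z readout on the first two qubits, and all function names (`ry`, `apply_cnot`, `vqc_actor`, and so on) are assumptions chosen for illustration only.

```python
import numpy as np

def ry(theta):
    """2x2 matrix of a single-qubit RY rotation."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])

def apply_single_qubit(state, gate, qubit, n_qubits):
    """Apply a 2x2 gate to one qubit of an n-qubit state vector."""
    psi = state.reshape([2] * n_qubits)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    psi = np.moveaxis(psi, 0, qubit)  # put the acted-on axis back in place
    return psi.reshape(-1)

def apply_cnot(state, control, target, n_qubits):
    """Flip the target qubit on the part of the state where control is |1>."""
    psi = state.reshape([2] * n_qubits).copy()
    idx = [slice(None)] * n_qubits
    idx[control] = 1
    sub = psi[tuple(idx)]  # control axis is dropped by integer indexing
    t_axis = target if target < control else target - 1
    psi[tuple(idx)] = np.flip(sub, axis=t_axis).copy()
    return psi.reshape(-1)

def expect_z(state, qubit, n_qubits):
    """Expectation value of Pauli-Z on one qubit, always in [-1, 1]."""
    probs = np.abs(state.reshape([2] * n_qubits)) ** 2
    other = tuple(a for a in range(n_qubits) if a != qubit)
    p = probs.sum(axis=other)
    return float(p[0] - p[1])

def vqc_actor(obs, weights):
    """Hypothetical VQC policy: observation -> 2-D continuous action.

    obs:     shape (n_qubits,), assumed pre-scaled to [-1, 1]
    weights: shape (n_layers, n_qubits), trainable rotation angles
    Assumes n_qubits >= 2.
    """
    n_layers, n_qubits = weights.shape
    state = np.zeros(2 ** n_qubits)
    state[0] = 1.0  # start in |0...0>
    # Angle encoding (an assumption): one observation value per qubit.
    for q in range(n_qubits):
        state = apply_single_qubit(state, ry(np.pi * obs[q]), q, n_qubits)
    # Variational layers: trainable RY rotations, then a ring of CNOTs.
    for layer in range(n_layers):
        for q in range(n_qubits):
            state = apply_single_qubit(state, ry(weights[layer, q]), q, n_qubits)
        for q in range(n_qubits):
            state = apply_cnot(state, q, (q + 1) % n_qubits, n_qubits)
    # Readout: <Z> on the first two qubits serves as a bounded action.
    return np.array([expect_z(state, 0, n_qubits), expect_z(state, 1, n_qubits)])

# Example call with 8 qubits and 5 layers (sizes taken from the abstract).
rng = np.random.default_rng(0)
weights = rng.uniform(-np.pi, np.pi, size=(5, 8))
obs = rng.uniform(-1.0, 1.0, size=8)
action = vqc_actor(obs, weights)
```

Because the readout is an expectation value, each action component is bounded in [-1, 1] without an extra squashing function, which is one reason such circuits are a natural fit for deterministic continuous-action policies like the DDPG actor. In this sketch the trainable parameter count is `weights.size`; the real circuit's count depends on its ansatz and may differ.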

Original language: English
Pages (from-to): 807-814
Number of pages: 8
Journal: International Conference on Agents and Artificial Intelligence
Volume: 1
State: Published - 2025
Event: 17th International Conference on Agents and Artificial Intelligence, ICAART 2025 - Porto, Portugal
Duration: 23 Feb 2025 - 25 Feb 2025

Keywords

  • Continuous Action Space
  • LiDAR
  • Quantum Reinforcement Learning
  • Robot Navigation
