A Critical Point Analysis of Actor-Critic Algorithms with Neural Networks

Martin Gottwald, Hao Shen, Klaus Diepold

Research output: Contribution to journal › Conference article › peer-review

Abstract

We investigate Actor-Critic algorithms from the perspective of non-convex optimisation. In recent years, powerful Deep Reinforcement Learning algorithms, such as Deep Deterministic Policy Gradients, have been observed to struggle even in tiny toy problems, yet only the critic training has been subject to intensive research. To close this gap, we conduct a critical point analysis of the actor training. First, we find that the reward function must satisfy additional conditions, beyond those for Deterministic Policy Gradients, so that the critic is a proper loss for the actor. Second, we address the impact of using over-parametrised Neural Networks in the actor part. If there are more parameters than samples, a Q-function has fewer sources of critical points with respect to its action input, leading to better actor training. Additionally, critical points of the actor loss are only those where the Q-function is extremal. Third, we outline challenges in formulating a sound optimisation task, which arise from conflicting requirements between Reinforcement Learning and Neural Network architectures.
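The chain-rule structure behind the abstract's second point can be illustrated with a minimal numerical sketch. This toy example is not from the paper: the linear actor, the quadratic critic, and all names below are hypothetical stand-ins chosen so that the two possible sources of critical points in the actor loss are visible by hand.

```python
# Toy deterministic actor-critic setup (illustrative only, not the
# paper's construction): a linear actor a = pi(s; theta) = theta * s
# and a known quadratic critic Q(s, a) = -(a - s)**2, maximal at a = s.
def actor(theta, s):
    return theta * s

def Q(s, a):
    return -(a - s) ** 2

# Actor loss: the negated critic evaluated at the actor's action,
# as in Deterministic Policy Gradient methods.
def actor_loss(theta, s):
    return -Q(s, actor(theta, s))

# Chain rule: dL/dtheta = -(dQ/da) * (dpi/dtheta). A critical point of
# the actor loss therefore arises either where dQ/da = 0 (the critic is
# extremal in the action, the desirable case) or where dpi/dtheta
# vanishes (a spurious source of critical points, which the abstract
# argues over-parametrisation helps suppress).
def actor_grad(theta, s):
    a = actor(theta, s)
    dQ_da = -2.0 * (a - s)   # derivative of the quadratic critic
    dpi_dtheta = s           # derivative of the linear actor
    return -dQ_da * dpi_dtheta

# Gradient descent on the actor loss drives theta toward 1,
# where a = s and the critic is maximal.
theta, s = 0.0, 2.0
for _ in range(200):
    theta -= 0.1 * actor_grad(theta, s)
print(round(theta, 3))  # converges to 1.0
```

Here the only critical point reached is one where the Q-function is extremal in the action; with a richer, under-parametrised actor, the `dpi/dtheta` factor could vanish elsewhere and stall training.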

Original language: English
Pages (from-to): 27-32
Number of pages: 6
Journal: IFAC Proceedings Volumes (IFAC-PapersOnLine)
Volume: 55
Issue number: 15
DOIs
State: Published - 1 Jul 2022
Event: 6th IFAC Conference on Intelligent Control and Automation Sciences, ICONS 2022 - Cluj-Napoca, Romania
Duration: 13 Jul 2022 – 15 Jul 2022

Keywords

  • Critical Points
  • Dynamic Programming
  • Function Approximation
  • Markov Decision Process
  • Optimisation
