A Comparison of Value-based and Policy-based Reinforcement Learning for Monitoring-informed Railway Maintenance Planning

Giacomo Arcieri, Cyprien Hoelzl, Oliver Schwery, Daniel Straub, Konstantinos G. Papakonstantinou, Eleni Chatzi

Publication: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

Optimal maintenance planning for railway infrastructure and assets forms a complex sequential decision-making problem. Railways are naturally subject to deterioration, which can result in compromised service and increased safety risks and costs. Maintenance actions ought to be proactively planned to prevent the adverse effects of deterioration and the associated costs. Such predictive actions can be planned based on monitoring data, which are often indirect and noisy, thus offering an uncertain assessment of the railway condition. From a mathematical perspective, this forms a stochastic control problem under data uncertainty, which can be cast as a Partially Observable Markov Decision Process (POMDP). In this work, we model the real-world problem of railway optimal maintenance planning as a POMDP, with the problem parameters inferred from real-world monitoring data. The POMDP model serves to infer beliefs over a set of hidden states, which aim to capture the evolution of the underlying deterioration process. The maintenance optimization problem is here ultimately solved via the use of deep Reinforcement Learning (RL) techniques, which allow for a more flexible and broad search over the policy space when compared to classical POMDP solution algorithms. A comparison of value-based and policy-based RL methods is also offered, which exploit deep learning architectures to model either action-value functions (i.e., the expected returns from an action-state pair) or directly the policy. Our work shows how this complex planning problem can be effectively solved via deep RL to derive an optimized maintenance policy of railway tracks, demonstrated on real-world monitoring data, and offers insights into the solution provided by different classes of RL algorithms.
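As context for the approach described in the abstract, the core of a POMDP-based monitoring pipeline is the Bayesian belief update over hidden deterioration states. The sketch below illustrates this mechanism only; the transition matrix, observation model, and state labels are hypothetical placeholders, not the parameters inferred from monitoring data in the paper.

```python
# Minimal sketch of a discrete POMDP belief update for tracking hidden
# deterioration states from noisy, indirect observations.
# All matrices below are illustrative placeholders (assumptions), not
# the real-world parameters inferred in the paper.
import numpy as np

def belief_update(belief, T, O, obs):
    """Bayesian filter step: b'(s') ∝ O[s', obs] * sum_s T[s, s'] * b(s)."""
    predicted = belief @ T            # predict: propagate the deterioration process
    updated = predicted * O[:, obs]   # correct: weight by observation likelihood
    return updated / updated.sum()    # normalize to a probability distribution

# Three hypothetical condition states: good, degraded, critical.
T = np.array([[0.90, 0.08, 0.02],    # transition model under "do nothing"
              [0.00, 0.85, 0.15],    # (deterioration is irreversible here)
              [0.00, 0.00, 1.00]])
O = np.array([[0.80, 0.15, 0.05],    # P(observation | state): noisy sensing
              [0.20, 0.60, 0.20],
              [0.05, 0.15, 0.80]])

b = np.array([1.0, 0.0, 0.0])        # start certain the track is "good"
b = belief_update(b, T, O, obs=1)    # an ambiguous reading shifts the belief
```

An RL agent (value-based or policy-based) would then act on the belief vector `b` rather than on the unobservable true state, which is what allows planning under the data uncertainty described above.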

Original language: English
Title: Structural Health Monitoring 2023
Subtitle: Designing SHM for Sustainability, Maintainability, and Reliability - Proceedings of the 14th International Workshop on Structural Health Monitoring
Editors: Saman Farhangdoust, Alfredo Guemes, Fu-Kuo Chang
Publisher: DEStech Publications
Pages: 2449-2459
Number of pages: 11
ISBN (electronic): 9781605956930
Publication status: Published - 2023
Event: 14th International Workshop on Structural Health Monitoring: Designing SHM for Sustainability, Maintainability, and Reliability, IWSHM 2023 - Stanford, United States
Duration: 12 Sept 2023 – 14 Sept 2023

Publication series

Name: Structural Health Monitoring 2023: Designing SHM for Sustainability, Maintainability, and Reliability - Proceedings of the 14th International Workshop on Structural Health Monitoring

Conference

Conference: 14th International Workshop on Structural Health Monitoring: Designing SHM for Sustainability, Maintainability, and Reliability, IWSHM 2023
Country/Territory: United States
City: Stanford
Period: 12/09/23 – 14/09/23
