Direct policy search as an alternative to POMDP for sequential decision problems in infrastructure planning

Elizabeth Bismut, Daniel Straub

Research output: Contribution to conference › Paper › peer-review

3 Scopus citations

Abstract

Most infrastructure planning challenges belong to the class of sequential decision problems, characterized by significant initial uncertainty on the demand and performance of the system, and the possibility to collect information and reduce the uncertainty throughout the service life. In this paper, we consider a generic infrastructure planning problem and compare two main solution frameworks, partially observable Markov decision processes (POMDPs) and direct policy search (DPS) with a choice of heuristics. A case study is set up so that the belief space is described by only two parameters and the POMDP approach yields an exact solution. We investigate the performance of direct policy search by examining the optimal choice of the heuristics through a comparison with the POMDP solution. Depending on the type of system and reward function considered, parameters defining the heuristics can be thresholds on the demand or the system reliability, after which intervention is required, or critical damage values that suggest a component repair. The choice of the parameters directly influences the quality of the solution found. We identify key factors for the optimal selection of the heuristic parameters for the generic problem, and provide insights into which specific features of the system guide the decision process.
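
As an illustration of the idea behind direct policy search, the following minimal sketch tunes a single heuristic parameter (a critical damage value that triggers a repair) by Monte Carlo simulation and picks the candidate with the lowest estimated expected cost. It is not the paper's case study: the deterioration model, the noisy yearly inspection, the cost values, and the names `expected_cost` and `direct_policy_search` are all assumptions made for this example.

```python
import random

# All numbers below are illustrative assumptions, not values from the paper.
FAILURE_COST = 100.0   # cost incurred when damage reaches the failure level
REPAIR_COST = 10.0     # cost of a preventive repair
FAILURE_LEVEL = 1.0    # damage level at which the component fails
OBS_NOISE_SD = 0.05    # standard deviation of the inspection error


def expected_cost(threshold, horizon=30, n_samples=500, seed=0):
    """Monte Carlo estimate of the expected life-cycle cost of the heuristic
    'repair when the observed damage is at or above `threshold`'."""
    # The same seed is reused for every threshold so that candidates are
    # compared on a common stream of random numbers.
    rng = random.Random(seed)
    total = 0.0
    for _sample in range(n_samples):
        damage = 0.0
        for _year in range(horizon):
            # Hypothetical deterioration: exponential yearly increment, mean 0.1.
            damage += rng.expovariate(10.0)
            # Noisy inspection: the decision maker never sees the true damage.
            observed = damage + rng.gauss(0.0, OBS_NOISE_SD)
            if damage >= FAILURE_LEVEL:
                total += FAILURE_COST   # failure, component is replaced
                damage = 0.0
            elif observed >= threshold:
                total += REPAIR_COST    # heuristic triggers a repair
                damage = 0.0
    return total / n_samples


def direct_policy_search(candidates):
    """Evaluate each candidate threshold and return the cheapest one
    together with the estimated cost of every candidate."""
    costs = {d: expected_cost(d) for d in candidates}
    best = min(costs, key=costs.get)
    return best, costs
```

Calling `direct_policy_search([0.3, 0.5, 0.7, 0.9, 2.0])` returns the threshold with the lowest estimated cost; the candidate `2.0` can never be reached before failure and therefore acts as a "never repair" baseline. A grid search stands in here for whatever optimizer a real study would use over the heuristic parameters.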

Original language: English
State: Published - 2019
Event: 13th International Conference on Applications of Statistics and Probability in Civil Engineering, ICASP 2019 - Seoul, Korea, Republic of
Duration: 26 May 2019 → 30 May 2019

Conference

Conference: 13th International Conference on Applications of Statistics and Probability in Civil Engineering, ICASP 2019
Country/Territory: Korea, Republic of
City: Seoul
Period: 26/05/19 → 30/05/19
