TY - GEN
T1 - 1–2–3–Go! Policy Synthesis for Parameterized Markov Decision Processes via Decision-Tree Learning and Generalization
AU - Azeem, Muqsit
AU - Chakraborty, Debraj
AU - Kanav, Sudeep
AU - Křetínský, Jan
AU - Mohagheghi, Mohammadsadegh
AU - Mohr, Stefanie
AU - Weininger, Maximilian
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
AB - Despite advances in probabilistic model checking, the scalability of verification methods remains limited. In particular, the state space often becomes extremely large when parameterized Markov decision processes (MDPs) are instantiated, even with moderate parameter values. Synthesizing policies for such huge MDPs is beyond the reach of available tools. We propose a learning-based approach to obtain a reasonable policy for such huge MDPs. The idea is to generalize optimal policies obtained by model checking small instances to larger ones using decision-tree learning. Consequently, our method bypasses the need for explicit state-space exploration of large models, providing a practical solution to the state-space explosion problem. We demonstrate the efficacy of our approach through extensive experiments on relevant models from the quantitative verification benchmark set. The experimental results indicate that our policies perform well, even when the size of the model is orders of magnitude beyond the reach of state-of-the-art analysis tools.
KW - Markov decision process
KW - model checking
KW - policy synthesis
KW - probabilistic verification
UR - http://www.scopus.com/inward/record.url?scp=85218471830&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-82703-7_5
DO - 10.1007/978-3-031-82703-7_5
M3 - Conference contribution
AN - SCOPUS:85218471830
SN - 9783031827020
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 97
EP - 120
BT - Verification, Model Checking, and Abstract Interpretation - 26th International Conference, VMCAI 2025, Proceedings
A2 - Shankaranarayanan, Krishna
A2 - Sankaranarayanan, Sriram
A2 - Trivedi, Ashutosh
PB - Springer Science and Business Media Deutschland GmbH
T2 - 26th International Conference on Verification, Model Checking, and Abstract Interpretation, VMCAI 2025
Y2 - 20 January 2025 through 21 January 2025
ER -