TY - GEN
T1 - Verification of Markov decision processes using learning algorithms
AU - Brázdil, Tomáš
AU - Chatterjee, Krishnendu
AU - Chmelík, Martin
AU - Forejt, Vojtěch
AU - Křetínský, Jan
AU - Kwiatkowska, Marta
AU - Parker, David
AU - Ujma, Mateusz
N1 - Publisher Copyright:
© 2014 Springer International Publishing Switzerland.
PY - 2014
Y1 - 2014
AB - We present a general framework for applying machine-learning algorithms to the verification of Markov decision processes (MDPs). The primary goal of these techniques is to improve performance by avoiding an exhaustive exploration of the state space. Our framework focuses on probabilistic reachability, which is a core property for verification, and is illustrated through two distinct instantiations. The first assumes that full knowledge of the MDP is available, and performs a heuristic-driven partial exploration of the model, yielding precise lower and upper bounds on the required probability. The second tackles the case where we may only sample the MDP, and yields probabilistic guarantees, again in terms of both the lower and upper bounds, which provides efficient stopping criteria for the approximation. The latter is the first extension of statistical model checking for unbounded properties in MDPs. In contrast with other related techniques, our approach is not restricted to time-bounded (finite-horizon) or discounted properties, nor does it assume any particular properties of the MDP. We also show how our methods extend to LTL objectives. We present experimental results showing the performance of our framework on several examples.
UR - http://www.scopus.com/inward/record.url?scp=84908682241&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-11936-6_8
DO - 10.1007/978-3-319-11936-6_8
M3 - Conference contribution
AN - SCOPUS:84908682241
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 98
EP - 114
BT - Automated Technology for Verification and Analysis - 12th International Symposium, ATVA 2014, Proceedings
A2 - Cassez, Franck
A2 - Raskin, Jean-François
PB - Springer Verlag
T2 - 12th International Symposium on Automated Technology for Verification and Analysis, ATVA 2014
Y2 - 3 November 2014 through 7 November 2014
ER -