SOS: Safe, optimal and small strategies for hybrid markov decision processes

Pranav Ashok, Jan Křetínský, Kim Guldstrand Larsen, Adrien Le Coënt, Jakob Haahr Taankvist, Maximilian Weininger

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

22 Scopus citations

Abstract

For hybrid Markov decision processes, Stratego can compute strategies that are safe for a given safety property and (in the limit) optimal for a given cost function. Unfortunately, these strategies cannot be exported easily since they are computed as a very long list. In this paper, we demonstrate methods to learn compact representations of the strategies in the form of decision trees. These decision trees are much smaller, more understandable, and can easily be exported as code that can be loaded into embedded systems. Despite the size compression and actual differences to the original strategy, we provide guarantees on both safety and optimality of the decision-tree strategy. On the top, we show how to obtain yet smaller representations, which are still guaranteed safe, but achieve a desired trade-off between size and optimality.

Original languageEnglish
Title of host publicationQuantitative Evaluation of Systems - 16th International Conference, QEST 2019, Proceedings
EditorsDavid Parker, Verena Wolf
PublisherSpringer Verlag
Pages147-164
Number of pages18
ISBN (Print)9783030302801
DOIs
StatePublished - 2019
Event16th International Conference on Quantitative Evaluation of Systems, QEST 2019 - Glasgow, United Kingdom
Duration: 10 Sep 201912 Sep 2019

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume11785 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference16th International Conference on Quantitative Evaluation of Systems, QEST 2019
Country/TerritoryUnited Kingdom
CityGlasgow
Period10/09/1912/09/19

Fingerprint

Dive into the research topics of 'SOS: Safe, optimal and small strategies for hybrid markov decision processes'. Together they form a unique fingerprint.

Cite this