Publication
Title
Bi-objective lexicographic optimization in Markov decision processes with related objectives
Author
Abstract
We consider lexicographic bi-objective problems on Markov Decision Processes (MDPs), where we optimize one objective while guaranteeing optimality of another. We propose a two-stage technique for solving such problems when the objectives are related (in a way that we formalize). We instantiate our technique for two natural pairs of objectives: minimizing the (conditional) expected number of steps to a target while guaranteeing the optimal probability of reaching it; and maximizing the (conditional) expected average reward while guaranteeing an optimal probability of staying safe (w.r.t. some safe set of states). For the first combination of objectives, which covers the classical frozen lake environment from reinforcement learning, we also report on experiments performed using a prototype implementation of our algorithm and compare it with what can be obtained from state-of-the-art probabilistic model checkers solving optimal reachability.
Language
English
Source (journal)
Lecture notes in computer science. - Berlin, 1973, currens
Source (book)
Automated Technology for Verification and Analysis : 21st International Symposium, ATVA 2023, Singapore, October 24–27, 2023
Source (series)
Lecture notes in computer science ; 14215
Publication
Berlin : Springer, 2023
ISBN
978-3-031-45328-1
DOI
10.1007/978-3-031-45329-8_10
Volume/pages
p. 203-223
UAntwerpen
Affiliation
Publications with a UAntwerp address
Record
Creation 27.10.2023
Last edited 17.06.2024