Publication
Title
Safe learning for near-optimal scheduling
Author
Abstract
In this paper, we investigate the combination of synthesis, model-based learning, and online sampling techniques to obtain safe and near-optimal schedulers for a preemptible task scheduling problem. Our algorithms can handle Markov decision processes (MDPs) with 10^20 states and beyond, which cannot be handled with state-of-the-art probabilistic model-checkers. We provide probably approximately correct (PAC) guarantees for learning the model. Additionally, we extend Monte-Carlo tree search with advice, computed using safety games or obtained using the earliest-deadline-first scheduler, to safely explore the learned model online. Finally, we implement and compare our algorithms empirically against shielded deep Q-learning on large task systems.
Language
English
Source (journal)
Lecture notes in computer science. - Berlin, 1973, currens
Source (book)
Quantitative Evaluation of Systems - 18th International Conference, QEST 2021, Paris, France, August 23-27, 2021
Source (series)
Theoretical computer science and general issues (LNTCS); 12846
Publication
Berlin: Springer, 2021
ISBN
978-3-030-85171-2
978-3-030-85172-9
DOI
10.1007/978-3-030-85172-9_13
Volume/pages
12846 (2021), p. 235-254
ISI
000696682500013
Full text (Publisher's DOI)
Full text (open access)
UAntwerpen
Faculty/Department
Research group
Project info
SAILor: Safe Artificial Intelligence and Learning for Verification.
Publication type
Subject
Affiliation
Publications with a UAntwerp address
External links
Web of Science
Record
Identifier
Creation 23.08.2021
Last edited 02.10.2024