Title
Safe learning for near-optimal scheduling

Author

Abstract
In this paper, we investigate the combination of synthesis, model-based learning, and online sampling techniques to obtain safe and near-optimal schedulers for a preemptible task scheduling problem. Our algorithms can handle Markov decision processes (MDPs) with 10^20 states and beyond, which cannot be handled by state-of-the-art probabilistic model checkers. We provide probably approximately correct (PAC) guarantees for learning the model. Additionally, we extend Monte-Carlo tree search with advice, computed using safety games or obtained using the earliest-deadline-first scheduler, to safely explore the learned model online. Finally, we implement and compare our algorithms empirically against shielded deep Q-learning on large task systems.
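
The abstract mentions guiding Monte-Carlo tree search with scheduling advice such as the earliest-deadline-first (EDF) rule or the safe actions of a safety game. As an illustration only, and not the authors' implementation, the Python sketch below shows one common way such advice can be wired into tree search: action selection is restricted to advised actions before applying UCB1. All names here (edf_advice, Node, select_action) are invented for this sketch.

import math

def edf_advice(state):
    # Toy EDF advice: among pending tasks, allow only those whose deadline
    # is earliest. `state` is assumed to be a list of (task_id, deadline)
    # pairs; the returned set contains the advised task ids.
    if not state:
        return set()
    earliest = min(deadline for _, deadline in state)
    return {task for task, deadline in state if deadline == earliest}

class Node:
    # Per-state search statistics for a flat UCB1 bandit over actions.
    def __init__(self, actions):
        self.visits = 0
        self.total = {a: 0.0 for a in actions}   # summed returns per action
        self.count = {a: 0 for a in actions}     # pulls per action

def select_action(node, actions, advice, c=1.4):
    # UCB1 selection restricted to advised actions; falls back to all
    # actions if the advice rules everything out.
    allowed = [a for a in actions if a in advice] or list(actions)
    def ucb(a):
        if node.count[a] == 0:
            return float("inf")                  # force initial exploration
        mean = node.total[a] / node.count[a]
        bonus = c * math.sqrt(math.log(node.visits + 1) / node.count[a])
        return mean + bonus
    return max(allowed, key=ucb)

if __name__ == "__main__":
    pending = [(0, 5), (1, 2), (2, 9)]           # (task_id, deadline)
    actions = [task for task, _ in pending]
    node = Node(actions)
    print(select_action(node, actions, edf_advice(pending)))  # prints 1

In the paper's setting the advice would instead come from a safety game or the full EDF scheduler, and this selection step would sit inside a complete MCTS loop with expansion, simulation, and backpropagation.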

Language
English

Source (journal)
Lecture notes in computer science. - Berlin, 1973, currens

Source (book)
Quantitative Evaluation of Systems - 18th International Conference, QEST 2021, Paris, France, August 23-27, 2021

Source (series)
Theoretical computer science and general issues (LNTCS); 12846

Publication
Berlin: Springer, 2021

ISBN
978-3-030-85171-2
978-3-030-85172-9

DOI
10.1007/978-3-030-85172-9_13

Volume/pages
12846 (2021), p. 235-254

ISI
000696682500013