Jump to Content

Finding Safe Zones of policies Markov Decision Processes

Lee Cohen
Michal Moshkovitz
NeurIPS 2023 (2023)

Abstract

Given a policy, we define a {\newprob}%\emph{safe zone} as a subset of states, such that most of the policy's trajectories are confined to this subset. The quality of the {\newprob }%safe zone is parameterized by the number of states and the escape probability, i.e., the probability that a random trajectory will leave the subset. Safe zones are especially interesting when they have a small number of states and low escape probability. We study the complexity of finding optimal {\textsc{safeZones}}, and show that in general the problem is computationally hard. For this reason we concentrate on computing approximate {\textsc{safeZones}.} Our main result is a bi-criteria approximation algorithm which gives a factor of almost $2$ approximation for both the escape probability and safe zone size, using a polynomial size sample complexity. We conclude the paper with an empirical evaluation of our algorithm.

Research Areas