Finding Safe Zones of policies Markov Decision Processes

Lee Cohen; Yishay Mansour; Michal Moshkovitz

Finding Safe Zones of policies Markov Decision Processes

Lee Cohen

Yishay Mansour

Michal Moshkovitz

NeurIPS 2023 (2023)

Download Google Scholar

Abstract

Given a policy, we define a {\newprob}%\emph{safe zone}
as a subset of states, such that most of the policy's trajectories are confined to this subset.
The quality of the {\newprob }%safe zone
is parameterized by the number of states and the escape probability, i.e., the probability that a random trajectory will leave the subset. Safe zones are especially interesting when they have a small number of states and low escape probability.

We study the complexity of finding optimal {\textsc{safeZones}},
and show that in general the problem is computationally hard. For this reason we concentrate on computing approximate {\textsc{safeZones}.}
Our main result is a bi-criteria approximation algorithm which gives a factor of almost $2$ approximation for both the escape probability and safe zone size, using a polynomial size sample complexity.
We conclude the paper with an empirical evaluation of our algorithm.

Research Areas

Machine intelligence

Explore our many areas of focus

Building a collaborative ecosystem

Shaping the future together

Translating discovery into real-world impact

Finding Safe Zones of policies Markov Decision Processes

Abstract

Research Areas

Meet the teams driving innovation

Google AI

Google Cloud

Google DeepMind

Google Labs