- Aparna Taneja
- Jackson Killian
- Lily Xu
- Milind Shashikant Tambe
- Paula Rodriguez Diaz
Abstract
Restless Multi-Armed Bandits (RMABs) are an important model for optimizing the allocation of limited resources in sequential decision-making settings. Typical RMABs assume that the budget (the number of arms pulled per round) is fixed at every step of the planning horizon. However, when planning in real-world settings, resources are not necessarily limited at each planning step; we may be able to redistribute surplus resources from one round to an earlier or later round. Often, this budget flexibility is constrained to a window of consecutive planning steps. In this paper, we define a general class of RMABs with flexible budgets, which we term F-RMABs, and provide an algorithm to solve them optimally. Additionally, we provide heuristics that trade off solution quality for efficiency, and we present experimental comparisons of different F-RMAB solution approaches.
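To make the budget-flexibility notion concrete, one plausible formulation (a sketch inferred from the abstract, not necessarily the paper's exact notation: the binary pull decisions a_i^{(t)} for arms i = 1, ..., N, the per-step budget B, and the flexibility window \mathcal{W} are our assumptions) contrasts the two constraint types as follows:

% Standard RMAB: exactly B arms are pulled at every step t of the horizon.
\sum_{i=1}^{N} a_i^{(t)} = B, \qquad t = 1, \dots, T.

% F-RMAB sketch: within each window \mathcal{W} of consecutive steps, only
% the total budget is fixed; the per-step allocation may vary.
\sum_{t \in \mathcal{W}} \sum_{i=1}^{N} a_i^{(t)} = |\mathcal{W}| \, B, \qquad a_i^{(t)} \in \{0, 1\}.

Under the second constraint, budget left unspent at one step of a window can be shifted to an earlier or later step of the same window, which is the flexibility the abstract describes.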