NeurIPS 2021

## Abstract

We give an $(\eps,\delta)$-differentially private algorithm for the Multi-Armed Bandit (MAB) problem in the shuffle model with a distribution-dependent regret of $O\left(\left(\sum_{a:\Delta_a>0}\frac{\log T}{\Delta_a}\right)+\frac{k\sqrt{\log\frac{1}{\delta}}\log T}{\eps}\right)$, and a distribution-independent regret of $O\left(\sqrt{kT\log T}+\frac{k\sqrt{\log\frac{1}{\delta}}\log T}{\eps}\right)$, where $T$ is the number of rounds, $\Delta_a$ is the suboptimality gap of the action $a$, and $k$ is the total number of actions.

```
Our upper bound almost matches the regret of the best known algorithms for the centralized model, and significantly outperforms the best known algorithm in the local model.
```

## Research Areas

### Learn more about how we do research

We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work