Density estimation for shift-invariant multidimensional distributions
Abstract
We study density estimation for classes of shift-invariant
distributions over R^d. A multidimensional distribution is
``shift-invariant'' if, roughly speaking, it is close in total
variation distance to any small shift of itself. Shift-invariance relaxes smoothness
assumptions commonly used in non-parametric density estimation to
allow jump discontinuities. The different classes of distributions
that we consider correspond to different rates of tail decay.
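As a purely illustrative formalization (an assumption for exposition;
the precise quantitative definition and parameters used in the paper
may differ), shift-invariance for a density p over R^d can be stated
as
\[
  d_{\mathrm{TV}}\bigl(p(\cdot),\, p(\cdot - \delta v)\bigr) \;\le\; c\,\delta
  \qquad \text{for every unit vector } v \in \mathbb{R}^d \text{ and every } 0 < \delta \le 1,
\]
where c > 0 is a parameter of the class; smaller c corresponds to a
stronger invariance requirement.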
For each such class we give an efficient algorithm that learns any
distribution in the class from independent samples with respect to
total variation distance. As a special case of our general result, we
show that multivariate log-concave distributions with a constant
number of variables can be learned in polynomial time, answering a
question of Diakonikolas et al. All of our results extend to a model
of noise-tolerant density estimation, in which the target distribution
to be learned is a (1-eps,eps) mixture of some unknown distribution in
the class with some other arbitrary and unknown distribution, and the
learning algorithm must output a hypothesis distribution with total
variation distance error O(eps) from the target distribution. We show
that our general results are close to best possible by proving a
simple information-theoretic lower bound on the sample complexity of
learning even bounded shift-invariant distributions.
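To make the noise-tolerant model concrete, the guarantee described
above can be written as follows (a sketch; the constant hidden in the
O(eps) bound is left unspecified): the target distribution is
\[
  p \;=\; (1-\epsilon)\, q \;+\; \epsilon\, r,
\]
where q is an unknown distribution in the class and r is arbitrary
and unknown, and given independent samples from p the algorithm must
output a hypothesis h with
\[
  d_{\mathrm{TV}}(h, p) \;\le\; O(\epsilon).
\]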