The Geometry of Random Features
Abstract
We present an in-depth examination of the effectiveness of radial basis function kernel (beyond
Gaussian) estimators based on orthogonal random feature maps. We show that orthogonal estimators
outperform state-of-the-art mechanisms that use iid sampling under weak conditions for tails of the
associated Fourier distributions. We prove that in the high-dimensional case, the superiority of the orthogonal transform over iid methods can be accurately measured by a property we define, called the charm of the kernel, and that orthogonal random features provide optimal kernel estimators.
Furthermore, we provide the first theoretical results that explain why orthogonal random features outperform unstructured ones on downstream tasks such as kernel ridge regression, by showing that orthogonal random features provide kernel algorithms with better spectral properties than the previous state-of-the-art. More generally, our results enable practitioners to estimate the benefits of applying orthogonal transforms.