Statistical performance of support vector machines
Abstract
The support vector machine (SVM) algorithm is well known to the computer
learning community for its very good practical results. The goal of the
present paper is to study this algorithm from a statistical perspective, using
tools of concentration theory and empirical processes.
Our main result builds on the observation made by other authors that the
SVM can be viewed as a statistical regularization procedure. From this point
of view, it can also be interpreted as a model selection principle using a penalized
criterion. It is then possible to adapt general methods related to model
selection in this framework to study two important points: (1) what is the
minimum penalty and how does it compare to the penalty actually used in the
SVM algorithm; (2) is it possible to obtain “oracle inequalities” in that setting,
for the specific loss function used in the SVM algorithm? We show that
the answer to the latter question is positive and provides relevant insight to the
former. Our result shows that it is possible to obtain fast rates of convergence
for SVMs.
learning community for its very good practical results. The goal of the
present paper is to study this algorithm from a statistical perspective, using
tools of concentration theory and empirical processes.
Our main result builds on the observation made by other authors that the
SVM can be viewed as a statistical regularization procedure. From this point
of view, it can also be interpreted as a model selection principle using a penalized
criterion. It is then possible to adapt general methods related to model
selection in this framework to study two important points: (1) what is the
minimum penalty and how does it compare to the penalty actually used in the
SVM algorithm; (2) is it possible to obtain “oracle inequalities” in that setting,
for the specific loss function used in the SVM algorithm? We show that
the answer to the latter question is positive and provides relevant insight to the
former. Our result shows that it is possible to obtain fast rates of convergence
for SVMs.