AutoAugment: Learning Augmentation Policies from Data
Abstract
In this paper, we take a closer look at data augmentation for images, and describe a
simple procedure called AutoAugment to search for improved data augmentation
policies. Our key insight is to create a search space of data augmentation policies,
evaluating the quality of a particular policy directly on the dataset of interest. In
our implementation, we have designed a search space where a policy consists
of many sub-policies, one of which is randomly chosen for each image in each
mini-batch. A sub-policy consists of two operations, each operation being an image
processing function such as translation, rotation, or shearing, and the probabilities
and magnitudes with which the functions are applied. We use a search algorithm
to find the best policy such that the neural network yields the highest validation
accuracy on a target dataset. Our method achieves state-of-the-art accuracy on
CIFAR-10, CIFAR-100, SVHN, and ImageNet (without additional data). On
ImageNet, we attain a Top-1 accuracy of 83.54%. On CIFAR-10, we achieve
an error rate of 1.48%, which is 0.65% better than the previous state-of-the-art.
Finally, policies learned from one dataset can be transferred to work well on other
similar datasets. For example, the policy learned on ImageNet allows us to achieve
state-of-the-art accuracy on the fine grained visual classification dataset Stanford
Cars, without fine-tuning weights pre-trained on additional data. Code to train
Wide-ResNet, Shake-Shake and ShakeDrop models with AutoAugment policies
can be found at https://github.com/tensorflow/models/tree/master/research/autoaugment
simple procedure called AutoAugment to search for improved data augmentation
policies. Our key insight is to create a search space of data augmentation policies,
evaluating the quality of a particular policy directly on the dataset of interest. In
our implementation, we have designed a search space where a policy consists
of many sub-policies, one of which is randomly chosen for each image in each
mini-batch. A sub-policy consists of two operations, each operation being an image
processing function such as translation, rotation, or shearing, and the probabilities
and magnitudes with which the functions are applied. We use a search algorithm
to find the best policy such that the neural network yields the highest validation
accuracy on a target dataset. Our method achieves state-of-the-art accuracy on
CIFAR-10, CIFAR-100, SVHN, and ImageNet (without additional data). On
ImageNet, we attain a Top-1 accuracy of 83.54%. On CIFAR-10, we achieve
an error rate of 1.48%, which is 0.65% better than the previous state-of-the-art.
Finally, policies learned from one dataset can be transferred to work well on other
similar datasets. For example, the policy learned on ImageNet allows us to achieve
state-of-the-art accuracy on the fine grained visual classification dataset Stanford
Cars, without fine-tuning weights pre-trained on additional data. Code to train
Wide-ResNet, Shake-Shake and ShakeDrop models with AutoAugment policies
can be found at https://github.com/tensorflow/models/tree/master/research/autoaugment