TF-Slim: A high level library to define complex models in TensorFlow
August 30, 2016
Posted by Nathan Silberman and Sergio Guadarrama, Google Research
Earlier this year, we released a TensorFlow implementation of a state-of-the-art image classification model known as Inception-V3. This code allowed users to train the model on the ImageNet classification dataset via synchronized gradient descent, using either a single local machine or a cluster of machines. The Inception-V3 model was built on an experimental TensorFlow library called TF-Slim, a lightweight package for defining, training and evaluating models in TensorFlow. The TF-Slim library provides common abstractions which enable users to define models quickly and concisely, while keeping the model architecture transparent and its hyperparameters explicit.
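To give a flavor of how concise such definitions can be, here is a minimal sketch of a small convolutional classifier written with TF-Slim. The architecture, layer sizes, and scope names are arbitrary choices for illustration, not one of the released models:

```python
import tensorflow as tf
import tensorflow.contrib.slim as slim

def toy_convnet(images, num_classes=10):
  """A small, illustrative classifier built from TF-Slim layers."""
  # arg_scope sets shared hyperparameters once, keeping each layer call
  # short while leaving the hyperparameters explicit in the code.
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(images, 64, [3, 3], scope='conv1')
    net = slim.max_pool2d(net, [2, 2], scope='pool1')
    net = slim.conv2d(net, 128, [3, 3], scope='conv2')
    net = slim.max_pool2d(net, [2, 2], scope='pool2')
    net = slim.flatten(net)
    net = slim.fully_connected(net, 256, scope='fc1')
    # No activation on the final layer: these are raw class logits.
    logits = slim.fully_connected(net, num_classes, activation_fn=None,
                                  scope='logits')
  return logits
```

Each slim layer call bundles variable creation, the op itself, and the activation into a single line, which is what keeps the architecture transparent while the hyperparameters stay visible.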
Since that release, TF-Slim has grown substantially, with many types of layers, loss functions, and evaluation metrics added, along with handy routines for training and evaluating models. These routines take care of the details of working at scale, such as reading data in parallel and deploying models on multiple machines. Additionally, we have created the TF-Slim Image Models library, which provides definitions and training scripts for many widely used image classification models, using standard datasets. TF-Slim and its components are already widely used within Google, and many of these improvements have already been integrated into tf.contrib.slim.
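As a rough sketch of what those training routines look like in practice, continuing from the model above (the loss, optimizer, step count, and log directory are placeholder choices):

```python
# `logits` comes from the model sketch above; `labels` is a one-hot tensor.
loss = slim.losses.softmax_cross_entropy(logits, labels)
total_loss = slim.losses.get_total_loss()  # adds regularization losses

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train_op = slim.learning.create_train_op(total_loss, optimizer)

# slim.learning.train runs the training loop, handling session creation,
# checkpointing, and summary writing.
slim.learning.train(train_op, logdir='/tmp/tfslim_model',
                    number_of_steps=1000)
```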
Today, we are proud to share the latest release of TF-Slim with the TF community. Some highlights of this release include:
- Many new kinds of layers (such as atrous convolution and deconvolution), enabling a much richer family of neural network architectures.
- Support for more loss functions and evaluation metrics (e.g., mAP, IoU).
- A deployment library to make it easier to perform synchronous or asynchronous training using multiple GPUs/CPUs, on the same machine or on multiple machines.
- Code to define and train many widely used image classification models (e.g., Inception[1][2][3], VGG[4], AlexNet[5], ResNet[6]).
- Pre-trained model weights for the above image classification models. These models have been trained on the ImageNet classification dataset, but can be used for many other computer vision tasks. As a simple example, we provide code to fine-tune these classifiers to a new set of output labels; a sketch of this workflow appears after this list.
- Tools to easily process standard image datasets, such as ImageNet, CIFAR-10, and MNIST.
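The fine-tuning workflow mentioned above amounts to restoring the pre-trained weights for everything except the final classification layer. A minimal sketch follows; the checkpoint path and the 'logits' scope name are placeholders, and the actual scripts in the release differ in detail:

```python
# Restore all pre-trained weights except the final classification layer,
# which is re-initialized to match the new set of output labels.
variables_to_restore = slim.get_variables_to_restore(exclude=['logits'])
init_fn = slim.assign_from_checkpoint_fn('/path/to/imagenet_model.ckpt',
                                         variables_to_restore)

# Passing init_fn makes the training loop start from the restored weights;
# `train_op` is assumed to be built as in the earlier training sketch.
slim.learning.train(train_op, logdir='/tmp/finetune_model', init_fn=init_fn)
```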
The release of the TF-Slim library and the pre-trained model zoo has been the result of widespread collaboration within Google Research. In particular, we want to highlight the vital contributions of the following researchers:
- TF-Slim: Sergio Guadarrama, Nathan Silberman.
- Model Definitions and Checkpoints: Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Jon Shlens, Zbigniew Wojna, Vivek Rathod, George Papandreou, Alex Alemi.
- Systems Infrastructure: Jon Shlens, Matthieu Devin, Martin Wicke.
- Jupyter notebook: Nathan Silberman, Kevin Murphy.
[1] Going deeper with convolutions, Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, CVPR 2015
[2] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Sergey Ioffe, Christian Szegedy, ICML 2015
[3] Rethinking the Inception Architecture for Computer Vision, Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Zbigniew Wojna, arXiv technical report 2015
[4] Very Deep Convolutional Networks for Large-Scale Image Recognition, Karen Simonyan, Andrew Zisserman, ICLR 2015
[5] ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, NIPS 2012
[6] Deep Residual Learning for Image Recognition, Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, CVPR 2016