Pixel RNN

Aäron van den Oord; Nal Kalchbrenner

ICML (2016)

Abstract

Modelling the distribution of natural images is a landmark problem in unsupervised learning.
We train a deep recurrent neural network to sequentially predict the pixels in an image. The network models the discrete joint probability of the raw pixel values. The distribution, though formally simple, can be arbitrarily complex and multimodal. The distribution is tractable and its ability to generalize is readily measured.
Within a pixel the colors are also predicted sequentially and depend on each other and the previous context. We design two types of parallel spatial LSTM layers to make the network fast and scalable.
Our main result is a compression score of 3.00 bits per color on CIFAR-10, which is considerably better than the previous state of the art. We also set new benchmarks on 32 × 32 and 64 × 64 ImageNet. Samples generated from the ImageNet model turn out general, sharp and globally coherent.
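
To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "bits per color" means the average negative log2-likelihood per color value, and that the joint probability is factorized by the chain rule over color values visited in raster order. The names `uniform_conditional`, `joint_log_prob` and `bits_per_value` are hypothetical, and the uniform conditional is only a stand-in for the trained network.

```python
import math


def uniform_conditional(value, context):
    """Hypothetical stand-in for the network's p(x_i | x_<i).

    It ignores the context and spreads probability evenly over the
    256 possible intensities of one color value.
    """
    return 1.0 / 256


def joint_log_prob(values, conditional):
    """Chain rule: log p(x) = sum_i log p(x_i | x_1, ..., x_{i-1}).

    `values` is a flat list of color values in generation order
    (assumed here: raster order, with R, G, B of each pixel in turn).
    """
    total = 0.0
    for i, v in enumerate(values):
        total += math.log(conditional(v, values[:i]))
    return total


def bits_per_value(values, conditional):
    """Average negative log2-likelihood per color value."""
    nll_nats = -joint_log_prob(values, conditional)
    return nll_nats / (len(values) * math.log(2))


# A made-up 32 x 32 RGB image, flattened to 3072 color values.
image = [(i * 7) % 256 for i in range(32 * 32 * 3)]
score = bits_per_value(image, uniform_conditional)
print(score)  # about 8 bits: a uniform model cannot compress 8-bit values
```

Under these assumptions, a model that knows nothing about images scores about 8 bits per color value, which gives a reference point for the 3.00 figure reported above. Because every conditional is an explicit, normalized distribution, the score can be computed exactly on held-out images.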

Research Areas

Machine intelligence
