Fast Decoding in Sequence Models Using Discrete Latent Variables

Lukasz Kaiser

Aurko Roy

Ashish Vaswani

Niki J. Parmar

Samy Bengio

Jakob Uszkoreit

Noam Shazeer

ICML (2018)


Abstract

Auto-regressive sequence models based on deep neural networks, such as RNNs, WaveNet, and Transformer, are the state of the art on many tasks. However, they lack parallelism and are thus slow for long sequences. RNNs lack parallelism during both training and decoding, while architectures like WaveNet and Transformer are much more parallel during training but still lack parallelism during decoding. We present a method to extend sequence models using discrete latent variables that makes decoding much more parallel. The main idea behind this approach is to first autoencode the target sequence into a shorter discrete latent sequence, which is generated auto-regressively, and finally to decode the full sequence from this shorter latent sequence in parallel. We verify that our method works on the task of neural machine translation, where our models are an order of magnitude faster than comparable auto-regressive models. We also introduce a new method for constructing discrete latent variables that allows us to obtain good BLEU scores.
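The decoding pipeline described in the abstract can be pictured with a toy sketch. This is not the authors' model: the function names, the compression factor of 8, the latent vocabulary size, and the arithmetic standing in for neural networks are all assumptions chosen for illustration. The only point it tries to show is where the sequential work happens: the auto-regressive loop runs over the short latent sequence, and every output position is then computed independently of the other output positions.

```python
import random


def autoregressive_latents(source, num_latents, vocab_size, seed=0):
    """Toy stand-in for the sequential stage (hypothetical, not a real model).

    Each latent symbol depends on the previous one, so this loop cannot
    be parallelized across positions.
    """
    rng = random.Random(seed + sum(source))
    latents = []
    for _ in range(num_latents):  # one sequential step per latent symbol
        prev = latents[-1] if latents else 0
        latents.append((prev + rng.randrange(vocab_size)) % vocab_size)
    return latents


def parallel_decode(latents, compression):
    """Toy stand-in for the parallel stage (hypothetical).

    Output position i reads only the latents and its own index, never
    another output position, so all positions could be computed at once.
    """
    n = len(latents) * compression
    return [latents[i // compression] * compression + i % compression
            for i in range(n)]


def fast_decode(source, target_len, compression=8, vocab_size=16):
    """Generate a short latent sequence sequentially, then expand it."""
    num_latents = -(-target_len // compression)  # ceiling division
    latents = autoregressive_latents(source, num_latents, vocab_size)
    output = parallel_decode(latents, compression)[:target_len]
    sequential_steps = num_latents  # steps taken by the sequential loop
    return latents, output, sequential_steps


latents, output, steps = fast_decode(source=[3, 1, 4], target_len=64)
print(len(latents), len(output), steps)
```

Under these assumed settings, a 64-token target needs 8 sequential steps rather than the 64 a fully auto-regressive decoder would take. The compression factor used here is illustrative and is not taken from the abstract.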
