QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension

Adams Wei Yu; David Dohan; Thang Luong; Rui Zhao; Kai Chen; Quoc Le

ICLR (2018)

Abstract

Current end-to-end Q&A models are primarily based on recurrent neural networks with attention. Despite their success, these models are often slow for both training and inference. We propose a novel Q&A model that does not require recurrent networks yet achieves equivalent or better performance than existing models. Our model is simple in that it consists exclusively of attention and convolutions. We present a thorough study of architectural choices that improve the accuracy of this simple model.
We also propose a novel data augmentation technique that not only adds training examples but also diversifies the phrasing of the sentences. It yields an immediate improvement in accuracy. This technique is of independent interest, since it can be readily applied to other natural language processing tasks.
On the SQuAD dataset, our model is 3x faster in training and 10x faster in inference. The model achieves an F1 score of 82.2 on the development set, which is on par with the best documented result of 81.8.
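
To make the idea of a model built only from convolutions and attention concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a sequence is a list of equal-length float vectors, and the function names (`local_conv1d`, `self_attention`, `conv_then_attend`) are hypothetical. The convolution shares one kernel across all channels and the attention uses identity projections; learned weights, normalization, and any other components of the actual model are left out because the abstract does not describe them.

```python
import math


def local_conv1d(seq, kernel):
    """Local mixing: each position is a weighted sum of its neighbours.

    Assumption: one kernel shared by every channel, zero padding at the edges.
    """
    pad = len(kernel) // 2
    n, d = len(seq), len(seq[0])
    out = []
    for t in range(n):
        row = [0.0] * d
        for j, w in enumerate(kernel):
            src = t + j - pad
            if 0 <= src < n:  # positions outside the sequence contribute zero
                for c in range(d):
                    row[c] += w * seq[src][c]
        out.append(row)
    return out


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Global mixing: scaled dot-product attention over all positions.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections), so there are no learned parameters.
    """
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, seq)) for c in range(d)])
    return out


def conv_then_attend(seq, kernel):
    """Apply the local step, then the global step, with no recurrence."""
    return self_attention(local_conv1d(seq, kernel))


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in conv_then_attend(tokens, [0.25, 0.5, 0.25]):
        print(row)
```

Neither step depends on the result at the previous position, so every position can be computed independently. This is the property that a recurrent network lacks.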
