Google Research

QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension

ICLR (2018)

Abstract

Current end-to-end Q&A models are primarily based on recurrent neural networks with attention. Despite their success, these models are often slow in both training and inference. We propose a novel Q&A model that requires no recurrent networks yet achieves equivalent or better performance than existing models. Our model is simple in that it consists exclusively of attention and convolutions. We present a thorough study of architectural choices that improve the accuracy of this simple model. We also propose a novel data augmentation technique that not only enhances the training examples but also diversifies the phrasing of the sentences, yielding an immediate improvement in accuracy. This technique is of independent interest because it can be readily applied to other natural language processing tasks. On the SQuAD dataset, our model is 3x faster in training and 10x faster in inference. The model achieves an F1 score of 82.2 on the development set, which is on par with the best documented result of 81.8.
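To make the "attention and convolutions only" idea concrete, here is a minimal NumPy sketch of one recurrence-free encoder block: a convolution over the sequence captures local context, and self-attention then lets every position attend to every other position. This is an illustrative assumption, not the authors' implementation. The function names (`conv1d_same`, `self_attention`, `encoder_block`), the single attention head, the residual connections, and all shapes are hypothetical choices made for the example.

```python
import numpy as np

def conv1d_same(x, kernel):
    """Local step: 1D convolution along the sequence with zero 'same' padding.

    x: (seq_len, dim); kernel: (width, dim, dim) with odd width (assumed).
    """
    width = kernel.shape[0]
    pad = width // 2
    padded = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        window = padded[t:t + width]                 # (width, dim) neighbourhood of t
        out[t] = np.einsum("wd,wde->e", window, kernel)
    return out

def self_attention(x, wq, wk, wv):
    """Global step: single-head scaled dot-product self-attention."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # each row sums to 1
    return weights @ v

def encoder_block(x, kernel, wq, wk, wv):
    """Convolution then self-attention, each with a residual connection."""
    x = x + conv1d_same(x, kernel)
    x = x + self_attention(x, wq, wk, wv)
    return x

# Toy usage with random weights (hypothetical sizes).
rng = np.random.default_rng(0)
seq_len, dim = 6, 4
x = rng.normal(size=(seq_len, dim))
kernel = rng.normal(size=(3, dim, dim)) * 0.1
wq, wk, wv = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))
out = encoder_block(x, kernel, wq, wk, wv)
```

Because no step depends on the output of the previous position, all positions can be processed in parallel, which is the property that makes such a model faster than a recurrent one. The Python loop in `conv1d_same` is only for readability and can be vectorised.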
