Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Adam Roberts; Colin Raffel; Katherine Lee; Michael Matena; Noam Shazeer; Peter J. Liu; Sharan Narang; Wei Li; Yanqi Zhou

Google (2019)

Abstract

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a lower-resource downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning for NLP by introducing a unified framework that casts every language problem as a text-to-text task. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of text understanding tasks. By combining the insights gained in our exploration with scale and a new giant unlabeled text dataset, we achieve state-of-the-art results on most of the tasks we consider. To facilitate future work on text understanding, we release our dataset, pre-trained models, and code.
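
As a rough illustration of what casting a problem as a text-to-text task can look like, the sketch below maps labeled examples from three different tasks to plain (input text, target text) pairs. The function name `to_text_to_text`, the task names, the dictionary field names, and the prefix strings are all assumptions made for this example; they are not taken from the abstract or from the released code.

```python
from typing import Dict, Tuple


def to_text_to_text(task: str, example: Dict[str, str]) -> Tuple[str, str]:
    """Cast one labeled example into an (input text, target text) pair.

    The prefixes and field names are hypothetical and chosen only for
    illustration. The point is that every task ends up as string in, string out.
    """
    if task == "translation":
        return f"translate English to German: {example['source']}", example["target"]
    if task == "sentiment":
        # A classification label is emitted as a literal word, not a class index.
        return f"sentiment: {example['sentence']}", example["label"]
    if task == "summarization":
        return f"summarize: {example['document']}", example["summary"]
    raise ValueError(f"unknown task: {task}")


if __name__ == "__main__":
    pair = to_text_to_text(
        "sentiment", {"sentence": "A delightful film.", "label": "positive"}
    )
    print(pair)
```

Because every task reduces to the same string-to-string shape, one model, one training objective, and one decoding procedure can in principle be shared across all of them, which is what makes a like-for-like comparison of objectives, architectures, and datasets possible.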
