Emergent Abilities of Large Language Models

Jason Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
Sebastian Borgeaud
Dani Yogatama
Maarten Bosma
Denny Zhou
Donald Metzler
Ed H. Chi
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
Jeff Dean
William Fedus
TMLR (2022)

Abstract

Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models: performance remains close to random until models reach a sufficiently large scale, so emergence cannot be predicted by extrapolating a scaling law fit to small-scale models. The existence of such emergence suggests that additional scaling could further expand the range of tasks that language models can perform. We discuss the implications of these phenomena and suggest directions for future research.
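To make the central claim concrete, here is a minimal sketch in Python of fitting a log-linear scaling trend to small models and extrapolating it past an emergence threshold. All model sizes, accuracies, and the threshold are invented for illustration; nothing below comes from the paper's experiments.

    # Illustrative sketch (invented numbers): why a scaling law fit to small
    # models fails to predict an emergent ability.
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical model sizes (parameters) and accuracies on a 4-way
    # multiple-choice task whose random-chance level is 25%.
    scales = np.array([1e8, 3e8, 1e9, 3e9, 1e10, 3e10, 1e11])

    # Near-random performance below a threshold scale, then a sharp jump:
    # the "emergent" pattern the abstract describes.
    accuracy = np.where(scales < 3e10,
                        0.25 + 0.01 * rng.random(len(scales)),
                        0.70)

    # Fit a log-linear trend (a power-law-style scaling law) to the five
    # smallest models only, then extrapolate to the largest scale.
    fit = np.polyfit(np.log(scales[:5]), accuracy[:5], deg=1)
    predicted = np.polyval(fit, np.log(scales[-1]))

    print(f"extrapolated accuracy at 1e11 params: {predicted:.2f}")   # stays near 0.25
    print(f"observed accuracy at 1e11 params:     {accuracy[-1]:.2f}")  # 0.70

Because every model the fit sees performs at chance level, the extrapolation predicts chance-level performance at all scales and misses the jump entirely, which is the sense in which emergent abilities cannot be predicted from small-scale scaling laws.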