Unlocking Compositional Generalization in Pre-trained Models Using Intermediate Representations

Ice Pasupat
Ming-Wei Chang
(2021)

Abstract

Pre-trained seq2seq models are prevalent in semantic parsing, but have been found to struggle with out-of-distribution compositional generalization. In contrast, specialized model architectures have been proposed to address this issue, often at the cost of generality and in-distribution performance. In this paper, we propose a simple strategy to unlock the compositionality of pre-trained seq2seq models through intermediate representations, without changing the model architecture at all. We identify several effective strategies for designing reversible and lossy intermediate representations that reduce the structural mismatch between inputs and outputs. We then apply either deterministic transformations or a second seq2seq model to map the intermediate form back to the original executable form. We find that the combination of our proposed transformations and pre-trained models is surprisingly effective, obtaining a new state of the art on CFQ (+11.9 accuracy points) and on the template splits of three text-to-SQL datasets (+15.0 to +19.4 accuracy points). This work highlights that intermediate representations provide an important (and potentially overlooked) degree of freedom for improving the compositional generalization abilities of pre-trained seq2seq models.
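
To make the two-stage approach concrete, the following is a minimal Python sketch, assuming a toy reversible intermediate representation that abbreviates SQL keywords; the transformation and all function names are illustrative assumptions, not the intermediate representations studied in the paper.

# Sketch of the pipeline from the abstract: a seq2seq model would predict
# the intermediate form, and a deterministic transformation maps it back
# to the executable form. The keyword-abbreviation scheme below is a
# hypothetical stand-in for the paper's intermediate representations.

FORWARD = {"SELECT": "sel", "FROM": "frm", "WHERE": "whr"}
BACKWARD = {v: k for k, v in FORWARD.items()}

def to_intermediate(sql: str) -> str:
    """Map an executable SQL string to a simpler intermediate form."""
    return " ".join(FORWARD.get(tok, tok) for tok in sql.split())

def to_executable(intermediate: str) -> str:
    """Deterministically invert the transformation (the reversible case)."""
    return " ".join(BACKWARD.get(tok, tok) for tok in intermediate.split())

if __name__ == "__main__":
    sql = "SELECT name FROM city WHERE population > 1000000"
    inter = to_intermediate(sql)
    assert to_executable(inter) == sql  # reversibility check
    print(inter)  # sel name frm city whr population > 1000000

For a lossy intermediate representation, to_executable would instead be a second seq2seq model (or a transformation that consults the input), since the dropped information cannot be recovered deterministically.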