Translate & Fill: Improving Zero-Shot Multilingual Semantic Parsing by Generating Synthetic Data

Massimo Nicosia; Zhongdi Qu; Yasemin Altun

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (Findings), Association for Computational Linguistics (2021) (to appear)

Abstract

While multilingual pretrained language models (LMs) fine-tuned on a single language have shown substantial cross-lingual task transfer capabilities, there is still a wide performance gap in semantic parsing tasks compared with settings where target-language supervision is available. In this paper, we propose a novel Translate-and-Fill (TaF) method for producing silver training data for a multilingual semantic parser. This method simplifies the popular Translate-Align-Project (TAP) pipeline and consists of a sequence-to-sequence filler model that constructs a full parse conditioned on an utterance and a view of the same parse. Our filler is trained on English data only but can accurately complete instances in other languages (i.e., translations of the English training utterances) in a zero-shot fashion. Experimental results on multiple multilingual semantic parsing datasets show that high-capacity multilingual pretrained LMs have remarkable zero-shot performance and that, with the help of our synthetic data, they reach accuracy competitive with similar systems that rely on traditional alignment techniques.
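To make the filler's inputs and outputs concrete, here is a minimal sketch of how a training pair and a zero-shot inference input might be assembled. The abstract does not specify the parse format, how the "view" is derived, or how the two inputs are joined, so everything below is an assumption for illustration: a bracketed intent/slot parse, a view obtained by dropping the utterance words from that parse, a `<sep>` separator, and the helper names `make_view`, `filler_input`, and `training_pair`.

```python
from typing import Tuple

SEP = "<sep>"  # hypothetical separator between utterance and view


def make_view(parse: str) -> str:
    """Assumed view: keep only labels and brackets, drop utterance words."""
    kept = []
    for token in parse.split():
        # Keep opening labels such as "[IN:GET_WEATHER" and closing "]".
        if token.startswith("[") or token == "]":
            kept.append(token)
    return " ".join(kept)


def filler_input(utterance: str, view: str) -> str:
    """Join an utterance with a parse view into one source sequence."""
    return f"{utterance} {SEP} {view}"


def training_pair(utterance: str, parse: str) -> Tuple[str, str]:
    """English-only training example: (utterance + view) -> full parse."""
    return filler_input(utterance, make_view(parse)), parse


# Illustrative English example (not taken from any dataset).
en_utterance = "what is the weather in Paris"
en_parse = "[IN:GET_WEATHER what is the weather in [SL:LOCATION Paris ] ]"
source, target = training_pair(en_utterance, en_parse)

# Zero-shot use: pair a translated utterance with the same view. A trained
# filler would then be asked to generate the filled parse for this input.
fr_utterance = "quel temps fait-il à Paris"
fr_source = filler_input(fr_utterance, make_view(en_parse))
```

In this sketch the filled parses produced for the translated utterances would serve as the silver training data for the multilingual parser, with no separate alignment or projection step.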
