Google Research

SGD-X: A Benchmark for Robust Generalization in Schema-Guided Dialogue Systems

AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (2022) (to appear)


Enabling zero-shot and few-shot transfer to unseen services is a critical challenge in task-oriented dialogue systems research. The Schema-Guided Dialogue (SGD) dataset introduced a novel paradigm for zero-shot transfer in which natural language descriptions represent the schema elements (i.e., the intents and slots supported by APIs), and models use these descriptions to understand the APIs they need to interact with. However, the impact of the choice of language in these descriptions on model performance remains unexplored. To address this, we release SGD-X, a benchmark for studying the robustness of schema-guided dialogue systems to linguistic variations in schemas. SGD-X augments the original SGD dataset with 5 crowdsourced paraphrases for each schema element name and description, where the paraphrases are semantically similar yet stylistically diverse. Using a novel metric for measuring robustness to schemas, we evaluate two schema-guided dialogue state tracking models on SGD-X and observe a significant drop in joint goal accuracy across schema variations, demonstrating that models can be sensitive to the choice of language used in schemas. Furthermore, we present a simple data augmentation method to improve robustness to linguistic variation in the schemas. Our work introduces a novel challenge to encourage robust dialogue modeling for better transfer learning.
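
As a rough illustration of the evaluation idea described above, the sketch below computes joint goal accuracy for each schema variant and then summarizes how much that score varies across variants. The abstract does not define the robustness metric, so the spread measure used here (the coefficient of variation) is an assumption chosen for illustration, and the function names `joint_goal_accuracy` and `variation_spread` are hypothetical.

```python
from statistics import mean, pstdev


def joint_goal_accuracy(predicted_states, gold_states):
    """Fraction of turns whose predicted dialogue state equals the gold state exactly."""
    if len(predicted_states) != len(gold_states):
        raise ValueError("predicted and gold state lists must have the same length")
    if not gold_states:
        raise ValueError("at least one turn is required")
    correct = sum(pred == gold for pred, gold in zip(predicted_states, gold_states))
    return correct / len(gold_states)


def variation_spread(scores):
    """Coefficient of variation (population std / mean) of per-variant scores.

    Assumption: this is only one plausible way to quantify sensitivity to
    schema variants; it is not taken from the abstract.
    """
    average = mean(scores)
    if average == 0:
        raise ValueError("mean score is zero; the spread is undefined")
    return pstdev(scores) / average
```

In use, one would run the same model once per schema variant (the original schema plus each paraphrased version), pass each run's predictions to `joint_goal_accuracy`, and pass the resulting list of scores to `variation_spread`. A value near zero would suggest the model is largely insensitive to how the schema is worded.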
