Kelvin Guu

Authored Publications
    Finetuned Language Models are Zero-Shot Learners
    Jason Wei
    Maarten Paul Bosma
    Vincent Zhao
    Nan Du
    International Conference on Learning Representations (2022)
    Abstract: This paper explores a simple method for improving the zero-shot learning abilities of language models. We show that instruction tuning---finetuning language models on a collection of tasks described via instructions---substantially boosts zero-shot performance on unseen tasks. We take a 137B parameter pretrained language model and instruction-tune it on over 60 NLP tasks verbalized via natural language instruction templates. We evaluate this instruction-tuned model, which we call FLAN, on unseen task types. FLAN substantially improves the performance of its unmodified counterpart and surpasses zero-shot 175B GPT-3 on 20 of 25 tasks that we evaluate. FLAN even outperforms few-shot GPT-3 by a large margin on ANLI, RTE, BoolQ, AI2-ARC, OpenbookQA, and StoryCloze. Ablation studies reveal that the number of tasks and model scale are key components to the success of instruction tuning.
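
The abstract describes instruction tuning only at a high level; below is a minimal, hypothetical sketch of how a single NLP example might be verbalized via a natural-language instruction template before ordinary seq2seq finetuning. The task, template wordings, and function names are illustrative assumptions, not the FLAN codebase.

# Illustrative sketch of instruction-tuning data construction (not the FLAN implementation).
# The templates and task below are hypothetical examples.
import random

# Each task contributes several natural-language instruction templates.
NLI_TEMPLATES = [
    "Premise: {premise}\nHypothesis: {hypothesis}\nDoes the premise entail the hypothesis? OPTIONS: yes, no",
    "{premise}\nBased on the paragraph above, can we conclude that \"{hypothesis}\"? OPTIONS: yes, no",
]

def verbalize_nli(example: dict) -> dict:
    """Turn a raw NLI example into an (instruction, target) pair for seq2seq finetuning."""
    template = random.choice(NLI_TEMPLATES)
    return {
        "input": template.format(premise=example["premise"], hypothesis=example["hypothesis"]),
        "target": "yes" if example["label"] == "entailment" else "no",
    }

if __name__ == "__main__":
    raw = {"premise": "A dog is running in the park.",
           "hypothesis": "An animal is outdoors.",
           "label": "entailment"}
    pair = verbalize_nli(raw)
    print(pair["input"])
    print("->", pair["target"])

Pairs built this way from many tasks are mixed together and used as standard finetuning data; zero-shot evaluation then presents an unseen task type through the same kind of instruction.
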
    Dialog Inpainting: Turning Documents into Dialogs
    Aida Amini
    Arun Tejasvi Chaganty
    Mike Green
    Vincent Zhao
    Zhuyun Dai
    arXiv preprint (2022)
    Abstract: Many important questions (e.g. "How to eat healthier?") require conversation to establish context and explore in depth. However, conversational question answering (ConvQA) systems have long been stymied by scarce training data that is expensive to collect. To address this problem, we propose a new technique for synthetically generating diverse and high-quality dialog data: dialog inpainting. Our approach takes the text of any document and transforms it into a two-person dialog between the writer and an imagined reader: we treat sentences from the article as utterances spoken by the writer, and then use a dialog inpainter to predict what the imagined reader asked or said in between each of the writer's utterances. By applying this approach to passages from Wikipedia and the web, we produce WikiDialog and WebDialog, two datasets totalling 19 million diverse information-seeking dialogs -- 1,000x larger than the largest existing ConvQA dataset. Furthermore, human raters judge the answer adequacy and conversationality of WikiDialog to be as good or better than existing manually-collected datasets. Using our inpainted data to pre-train ConvQA retrieval systems, we significantly advance state-of-the-art across three benchmarks (QReCC, OR-QuAC, TREC CAsT) yielding up to 40% relative gains on standard evaluation metrics.
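
As a rough illustration of the inpainting procedure described above, the sketch below interleaves a placeholder reader turn between consecutive sentences of a document and asks a dialog model to fill it in. The `inpainter` callable, mask token, and prompt layout are assumptions for illustration, not the paper's actual model or format.

# Minimal sketch of dialog-inpainting data construction; the inpainter is a stand-in
# for any dialog model that can fill a masked turn.
from typing import Callable, List

MASK = "<missing_turn>"  # placeholder turn to be predicted by the inpainter

def inpaint_dialog(sentences: List[str], inpainter: Callable[[List[str]], str]) -> List[str]:
    """Interleave imagined reader turns between a document's sentences (writer turns)."""
    dialog: List[str] = []
    for sentence in sentences:
        # Ask the inpainter what the reader might have said before this writer turn.
        reader_turn = inpainter(dialog + [MASK, sentence])
        dialog.append("Reader: " + reader_turn)
        dialog.append("Writer: " + sentence)
    return dialog

if __name__ == "__main__":
    # Toy "inpainter" that always asks a generic question; a real one is a trained dialog model.
    dummy_inpainter = lambda context: "Can you tell me more about that?"
    doc = ["The Eiffel Tower is in Paris.", "It was completed in 1889."]
    for turn in inpaint_dialog(doc, dummy_inpainter):
        print(turn)
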
    Abstract: In practical applications of semantic parsing, we occasionally want to control the behavior of the parser, such as making it output meaning representations in a new domain, or influencing the prediction on some queries toward certain patterns. While it is possible to fine-tune the parser on examples exhibiting the target behavior, a method that does not consume as much time or computation resources would be preferable. To this end, we propose the retrieval-augmented generative semantic parser (RAG-SP): given the input query, the parser retrieves relevant information from the retrieval index, augments the query with it, and then applies a generative model to produce an output. The augmented information acts as a soft influence on the generative model, and by manipulating the retrieval index or how the augmented query is constructed, we can manipulate the behavior of the parser. On the MTOP dataset, in addition to achieving state-of-the-art on the standard setup, we show that RAG-SP can parse queries in a new domain or adapt the prediction toward the specified patterns without having to fine-tune the model. With some modifications, RAG-SP also performs well on the episodic few-shot setup on the SNIPS slot tagging dataset.
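
A schematic sketch of the retrieve-and-augment flow this abstract describes, under the assumption of a toy token-overlap retriever and a small in-memory exemplar index (both hypothetical); the real system uses a learned retriever and a generative parser consuming the augmented query.

# Rough sketch of retrieval-augmented parsing input construction (not the paper's code).
from typing import List, Tuple

# (query, meaning representation) pairs acting as the retrieval index.
INDEX: List[Tuple[str, str]] = [
    ("set an alarm for 7 am", "[IN:CREATE_ALARM [SL:TIME 7 am]]"),
    ("play some jazz music", "[IN:PLAY_MUSIC [SL:GENRE jazz]]"),
]

def retrieve(query: str, k: int = 1) -> List[Tuple[str, str]]:
    """Score index entries by token overlap with the query and return the top k."""
    def overlap(entry: Tuple[str, str]) -> int:
        return len(set(query.split()) & set(entry[0].split()))
    return sorted(INDEX, key=overlap, reverse=True)[:k]

def augment(query: str) -> str:
    """Concatenate retrieved exemplars to the query; the generative parser sees this string."""
    exemplars = retrieve(query)
    context = " ; ".join(f"{q} => {mr}" for q, mr in exemplars)
    return f"{query} | exemplars: {context}"

if __name__ == "__main__":
    # Editing INDEX (e.g. adding a new domain's exemplars) steers the parser without finetuning.
    print(augment("set an alarm for 6 pm"))
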
    Abstract: Pre-trained seq2seq models are prevalent in semantic parsing, but have been found to struggle at out-of-distribution compositional generalization. In contrast, specialized model architectures have been proposed to address this issue, often at the cost of generality and in-distribution performance. In this paper, we propose a simple strategy to unlock compositionality of pre-trained seq2seq models through intermediate representations, without changing the model architectures at all. We identify several effective strategies for designing reversible and lossy intermediate representations that reduce the structural mismatch between inputs and outputs. We then apply either deterministic transformations or a second seq2seq to map the intermediate form to the original executable form. We find that the combination of our proposed transformations and pre-trained models is surprisingly effective, obtaining a new state-of-the-art on CFQ (+11.9 accuracy points) and on the template-splits of three text-to-SQL datasets (+15.0 to +19.4 accuracy points). This work highlights that intermediate representations provide an important (and potentially overlooked) degree of freedom for improving the compositional generalization abilities of pre-trained seq2seq models.
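
To make the idea of a reversible intermediate representation concrete, here is a toy, hypothetical example: a nested unary program is flattened into a more input-aligned sequence for the seq2seq model to predict, and a deterministic inverse transform recovers the executable form. The transformation below is illustrative only and is not one of the paper's actual representations.

# Toy reversible intermediate representation: nested program <-> flat token sequence.
def to_intermediate(program: str) -> str:
    """answer(city(loc(texas)))  ->  'answer city loc texas'"""
    return program.replace("(", " ").replace(")", "").strip()

def from_intermediate(intermediate: str) -> str:
    """'answer city loc texas'  ->  answer(city(loc(texas)))   (deterministic inverse)"""
    tokens = intermediate.split()
    return "(".join(tokens) + ")" * (len(tokens) - 1)

if __name__ == "__main__":
    original = "answer(city(loc(texas)))"
    inter = to_intermediate(original)
    assert from_intermediate(inter) == original  # the transformation is lossless
    print(inter)

Because the model is trained to emit the flatter intermediate form, its output aligns more closely with the input words; the executable program is recovered afterward by the deterministic inverse (or, for lossy representations, by a second seq2seq model).
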
    Retrieval Augmented Language Model Pre-Training
    Zora Tung
    Panupong Pasupat
    Ming-Wei Chang
    Proceedings of the 37th International Conference on Machine Learning (2020) (to appear)
    Abstract: Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents. We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA). We compare against state-of-the-art models for both explicit and implicit knowledge storage on three popular Open-QA benchmarks, and find that we outperform all previous methods by a significant margin (4-16% absolute accuracy), while also providing qualitative benefits such as interpretability and modularity.
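
A schematic sketch of the marginalization the abstract describes: the retrieval distribution p(z|x) is a softmax over inner-product scores, and the output distribution marginalizes over retrieved documents. The toy embeddings, document set, and per-document answer distributions below are placeholder assumptions, not the REALM implementation, which uses learned encoders and maximum inner product search over millions of documents.

# Schematic REALM-style marginalization: p(y|x) = sum_z p(y|x,z) * p(z|x).
import numpy as np

EMBED_DIM = 8

def embed(text: str) -> np.ndarray:
    """Toy embedding derived from the text hash (placeholder for a learned encoder)."""
    return np.random.default_rng(abs(hash(text)) % (2**32)).standard_normal(EMBED_DIM)

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def realm_predict(query: str, documents: list, p_y_given_xz: np.ndarray) -> np.ndarray:
    """Marginalize the answer distribution over retrieved documents."""
    # Inner-product retrieval scores (learned and differentiable in the real system).
    scores = np.array([embed(query) @ embed(d) for d in documents])
    p_z = softmax(scores)              # retrieval distribution p(z|x)
    return p_y_given_xz.T @ p_z        # sum over documents z

if __name__ == "__main__":
    docs = ["Paris is the capital of France.", "The pyramids are in Egypt."]
    # Hypothetical per-document answer distributions p(y|x,z) over two candidate answers.
    p_y_given_xz = np.array([[0.9, 0.1],
                             [0.2, 0.8]])
    print(realm_predict("Where is the capital of France?", docs, p_y_given_xz))
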