STaR: Self-Taught Reasoner Bootstrapping Reasoning With Reasoning

Eric Zelikman; Jesse Mu; Noah D. Goodman; Yuhuai Tony Wu

NeurIPS (2022) (to appear)

Abstract

Generating step-by-step "chain-of-thought" rationales improves language model performance on complex reasoning tasks like mathematics or commonsense question-answering. However, inducing language model rationale generation currently requires either constructing massive rationale datasets or sacrificing accuracy by using only few-shot inference.
We propose a technique to iteratively leverage a small number of rationale examples and a large dataset without rationales, to bootstrap the ability to perform successively more complex reasoning.
This technique, the "Self-Taught Reasoner" (STaR), relies on a simple loop: generate rationales to answer many questions, prompted with a few rationale examples; if the generated answers are wrong, try again to generate a rationale given the correct answer; fine-tune on all the rationales that ultimately yielded correct answers; repeat.
We show that STaR significantly improves performance on multiple datasets compared to a model fine-tuned to directly predict final answers, and performs comparably to fine-tuning a 30× larger state-of-the-art language model on CommonsenseQA. Thus, STaR lets a model improve itself by learning from its own generated reasoning.
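
The loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Example` type, the `generate` and `finetune` callables, and the toy stand-ins at the bottom are all hypothetical names introduced here. The abstract also does not say which checkpoint each round fine-tunes from; restarting from the base model is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Example:
    question: str
    answer: str


# Hypothetical signatures (assumptions for this sketch, not from the paper):
#   generate(model, question, few_shot, hint) -> (rationale, predicted_answer)
#   finetune(model, kept) -> new_model
Generate = Callable[[object, str, List[str], Optional[str]], Tuple[str, str]]
Finetune = Callable[[object, List[Tuple[Example, str]]], object]


def star(base_model: object, dataset: List[Example], few_shot: List[str],
         generate: Generate, finetune: Finetune, n_iterations: int) -> object:
    model = base_model
    for _ in range(n_iterations):
        kept: List[Tuple[Example, str]] = []
        for ex in dataset:
            # Generate a rationale and answer, prompted with a few examples.
            rationale, predicted = generate(model, ex.question, few_shot, None)
            if predicted != ex.answer:
                # Wrong answer: try again, this time given the correct answer.
                rationale, predicted = generate(
                    model, ex.question, few_shot, ex.answer)
            # Keep only rationales that ultimately yielded the correct answer.
            if predicted == ex.answer:
                kept.append((ex, rationale))
        # Assumption: each round fine-tunes from the base model.
        model = finetune(base_model, kept)
    return model


# Toy stand-ins so the sketch runs: the "model" is a lookup table.
def toy_generate(model: Dict[str, str], question: str,
                 few_shot: List[str], hint: Optional[str]) -> Tuple[str, str]:
    if hint is not None:
        return f"Working toward {hint}.", hint
    answer = model.get(question, "unknown")
    return f"Recalled {answer}.", answer


def toy_finetune(model: Dict[str, str],
                 kept: List[Tuple[Example, str]]) -> Dict[str, str]:
    updated = dict(model)  # copy, so the base model is never mutated
    updated.update({ex.question: ex.answer for ex, _ in kept})
    return updated


data = [Example("2+2?", "4"), Example("3+3?", "6")]
prompts = ["Q: 1+1? A: 1 and 1 make 2. The answer is 2."]
trained = star({}, data, prompts, toy_generate, toy_finetune, n_iterations=2)
```

In the toy run the lookup table starts empty, so every first attempt fails and the retry with the correct answer supplies the training data. A real setup would replace both toy functions with calls to a language model and a fine-tuning job.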

Research Areas

Machine intelligence
