
Autoformalization with Large Language Models

  • Albert Jiang
  • Charles Edgar Staats
  • Christian Szegedy
  • Markus Rabe
  • Mateja Jamnik
  • Wenda Li
  • Yuhuai Tony Wu
NeurIPS (2022)

Abstract

Autoformalization is the process of automatically translating natural language mathematics into formal specifications and proofs, a task similar to software synthesis. Here, we study the ability of large language models to produce correct formalizations of informally given statements. These models are trained using a next-token-prediction language modeling objective on a large web corpus containing both natural language and formal (including programming) languages. We use few-shot prompting of large language models and find that they can correctly translate a significant portion of mathematical competition problems into formal specifications in Isabelle/HOL. We demonstrate the usefulness of this process by improving a previously introduced neural theorem prover for Isabelle via training on these autoformalized theorems. Our methodology results in a new state-of-the-art on the miniF2F theorem-proving benchmark, improving the proof rate from 29.6% to 35.2%.
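As a concrete illustration of the few-shot setup described in the abstract, the sketch below shows the kind of prompt-and-completion pair involved: an informal competition-style statement paired with a plausible Isabelle/HOL formalization. The specific problem, the theorem name sum_of_evens_even, and the exact prompt wording are illustrative assumptions, not examples taken from the paper.

    Natural language version:
    "Show that the sum of two even natural numbers is even."

    Isabelle version:

    (* Hypothetical formalization; uses only standard Isabelle/HOL
       syntax, where "even" is provided by the Main library. *)
    theorem sum_of_evens_even:
      fixes a b :: nat
      assumes "even a" and "even b"
      shows "even (a + b)"

In the paper's pipeline, formal statements of this kind (without proofs) are what the language model is prompted to produce, and the resulting autoformalized theorems then serve as additional training data for the neural theorem prover.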
