Can Large Language Models Reason About Program Invariants?

Charles Sutton

David Bieber

Kensen Shi

Kexin Pei

Pengcheng Yin

ICML (2023) (to appear)

Google Scholar

Abstract

Identifying invariants in programs is an important program analysis task with applications towards program understanding, vulnerability analysis, and formal verification. Existing tools for identifying invariants rely on dynamic analysis, requiring traces collected from multiple executions in order to produce reliable invariants. We study the application of large language models to invariant prediction, finding that models training on source code and fine-tuned to invariant prediction can perform invariant prediction as static rather than dynamic analysis. Using a scratchpad approach gives the best performance, finding invariants statically of quality comparable to those obtained by a dynamic analysis tool with access to five program traces.

Research Areas

Machine intelligence
Software systems

Explore our many areas of focus

Building a collaborative ecosystem

Shaping the future together

Translating discovery into real-world impact

Can Large Language Models Reason About Program Invariants?

Abstract

Research Areas

Meet the teams driving innovation

Google AI

Google Cloud

Google DeepMind

Google Labs