- Charles Sutton
- Daniel Johnson
- Danny Tarlow
- David Bieber
- Kensen Shi
- Petros Maniatis
- Vincent Josua Hellendoorn
Abstract
Graph representations of programs are commonly a central element of machine learning for code research. We introduce an open source Python library python_graphs that applies static analysis to construct graph representations of Python programs suitable for training machine learning models. Our library admits the construction of control-flow graphs, data-flow graphs, and composite "program graphs" that combine control-flow, data-flow, syntactic, and lexical information about a program. We present the capabilities and limitations of the library, perform a case-study applying the library to millions of competitive programming submissions, and showcase the library's utility for machine learning research.
Research Areas
Learn more about how we do research
We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work