Google Research

The Open Reaction Database

  • Abigail G. Doyle
  • Anton Kast
  • Connor W. Coley
  • Joel M. Hawkins
  • Klavs F. Jensen
  • Michael R. Maser
  • Michael Wleklinski
  • Spencer D. Dreher


Chemical reaction data in journal articles, patents, and even electronic laboratory notebooks are currently stored in various formats, often unstructured, which presents a significant barrier to downstream applications, including the training of machine learning models. We present the Open Reaction Database (ORD), an open-access schema and infrastructure for structuring and sharing organic reaction data, including a centralized data repository. The ORD schema supports conventional and emerging technologies, from benchtop reactions to automated high-throughput experiments and flow chemistry. The data, schema, supporting code, and web-based user interfaces are all publicly available on GitHub. Our vision is that a consistent data representation and infrastructure to support data sharing will enable downstream applications that greatly improve the state of the art in computer-aided synthesis planning, reaction prediction, and other predictive chemistry tasks.
