Reliable Data Processing with Minimal Toil

Athena Vawda; Betsy (Adrienne Elizabeth) Beyer; John James Lunney; Julia Lee; Pieter Coucke; Rich Feit; Rita Sodt

(2021)

Abstract

This paper discusses an approach for making data pipelines both safer and less manual. We detail how we applied well-known reliability best practices from user-facing services to the batch jobs that underpin many of the services that make up Google Workspace. Using validation steps, canarying, and target populations for data pipelines, we ensure that only stable versions are promoted to the next environment stage. By moving to a single, standardized platform, we minimized duplicate effort across services. We also touch on how we optimized batch jobs for both correctness and freshness SLOs, and on the benefits of batch jobs versus asynchronous event-based processing.
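As a rough illustration of the promotion idea described in the abstract, the following Python sketch runs a pipeline version through successive environment stages and stops at the first stage whose output fails validation. It is not the platform the paper describes: the stage names, the `promote` and `validate` helpers, and the shape of the pipeline output are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stage names; the abstract does not list the actual environments.
STAGES = ["dev", "canary", "prod"]

Output = Dict[str, int]
Check = Callable[[Output], bool]


@dataclass
class StageResult:
    stage: str
    passed: bool
    failures: List[str]


def validate(output: Output, checks: Dict[str, Check]) -> List[str]:
    """Return the names of all checks that the stage output fails."""
    return [name for name, check in checks.items() if not check(output)]


def promote(
    run_pipeline: Callable[[str], Output],
    checks: Dict[str, Check],
    stages: List[str] = STAGES,
) -> List[StageResult]:
    """Run the pipeline stage by stage; stop at the first failed validation.

    A version only reaches a later stage if every earlier stage validated.
    """
    results: List[StageResult] = []
    for stage in stages:
        output = run_pipeline(stage)
        failures = validate(output, checks)
        results.append(StageResult(stage, not failures, failures))
        if failures:
            break  # Do not promote an unstable version any further.
    return results


if __name__ == "__main__":
    # Hypothetical pipeline: reports row counts for the stage it ran in.
    def example_pipeline(stage: str) -> Output:
        return {"rows_in": 100, "rows_out": 100}

    example_checks = {"no_row_loss": lambda out: out["rows_out"] == out["rows_in"]}
    for result in promote(example_pipeline, example_checks):
        print(result)
```

In this sketch a canary stage would typically process only a small target population, so a bad version is caught before it touches the full data set. How the population is chosen is left out here because the abstract does not specify it.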

Research Areas

Software systems
