Taming Google-Scale Continuous Testing

Atif Memon

Bao Nguyen

Eric Nickell

John Micco

Sanjeev Dhanda

Rob Siemborski

Zebao Gao

ICSE '17: Proceedings of the 39th International Conference on Software Engineering (2017) (to appear)

Abstract

Growth in Google’s code size and feature churn rate has seen increased reliance on continuous integration (CI) and testing to maintain quality. Even with enormous resources dedicated to testing, we are unable to regression test each code change individually, resulting in increased lag time between code check-ins and test result feedback to developers. We report results of a project that aims to reduce this time by: (1) controlling test workload without compromising quality, and (2) distilling test results data to inform developers, while they write code, of the impact of their latest changes on quality. We model, empirically understand, and leverage the correlations that exist between our code, test cases, developers, programming languages, and code-change and test-execution frequencies, to improve our CI and development processes. Our findings show: very few of our tests ever fail, but those that do are generally “closer” to the code they test; certain frequently modified code and certain users/tools cause more breakages; and code recently modified by multiple developers (more than 3) breaks more often.

NOTE: You can find the anonymized dataset for our paper on Google Drive: https://drive.google.com/open?id=0B5_QHWCtac81VGNKYnhrQkJBZGM
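
To make the last finding in the abstract concrete, here is a minimal Python sketch of how one might flag files recently modified by more than 3 developers. It is not the authors' implementation. The function name `files_with_many_recent_authors`, the `(path, author, timestamp)` record shape, and the 7-day window are all assumptions made for illustration; the abstract does not define "recently" or describe the data format. Only the "more than 3" threshold comes from the abstract.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def files_with_many_recent_authors(changes, now, window=timedelta(days=7), threshold=3):
    """Return paths touched by more than `threshold` distinct authors within `window`.

    `changes` is an iterable of (path, author, timestamp) tuples. This record
    shape and the default 7-day window are hypothetical choices for this sketch.
    """
    cutoff = now - window
    authors_by_path = defaultdict(set)
    for path, author, timestamp in changes:
        # Only changes inside the recency window count toward the author set.
        if timestamp >= cutoff:
            authors_by_path[path].add(author)
    # "More than 3" in the abstract maps to a strict comparison here.
    return sorted(
        path for path, authors in authors_by_path.items() if len(authors) > threshold
    )
```

A CI system could use such a list as one signal among several when deciding which changes deserve earlier or more thorough testing; how the paper's project actually combines its signals is not stated in the abstract.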

Research Areas

Software engineering
