De-Flake Your Tests: Automatically Locating Root Causes of Flaky Tests in Code At Google

Diego Cavalcanti
International Conference on Software Maintenance and Evolution (ICSME) 2020, IEEE


Regression testing is a critical part of software development and maintenance. It ensures that modifications to existing software do not break existing behavior and functionality. One of the key assumptions about regression tests is that their results are deterministic: when executed without any modifications and with the same configuration, they either always fail or always pass. In practice, however, some tests are non-deterministic; these are called flaky tests. Flaky tests make the results of test runs unreliable, and they disrupt the software development workflow. In this paper, we present a novel technique to automatically identify the locations of the root causes of flaky tests at the code level to help developers debug and fix them. We study the technique on flaky tests across 428 projects at Google. Based on our case studies, the technique helps identify the location of the root causes of flakiness with 82% accuracy. Furthermore, our studies show that integration into the appropriate developer workflows, simplicity of debugging aids, and fully automated fixes are crucial and preferred components for the adoption and usability of flakiness debugging and fixing tools.
