Tim A. D. Henderson

Tim A. D. Henderson

Dr. Tim A. D. Henderson is a researcher and software engineer at Google. At Google he works on test automation, continuous integration, test case selection, test case prioritization, and fault localization. Tim received a PhD in Computer Science from Case Western Reserve University in Cleveland, Ohio in 2017. Other research interests include data mining, graph mining, software engineering, software testing, code clones, fault localization and program analysis. Publications, articles, and additional information can be found on his website: https://hackthology.com/
Authored Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    Flake Aware Culprit Finding
    Eric Nickell
    Collin Johnston
    Avi Kondareddy
    Proceedings of the 16th IEEE International Conference on Software Testing, Verification and Validation (ICST 2023), IEEE (to appear)
    Preview abstract When a change introduces a bug into a large software repository, there is often a delay between when the change is committed and when bug is detected. This is true even when the bug causes an existing test to fail! These delays are caused by resource constraints which prevent the organization from running all of the tests on every change. Due to the delay, a Continuous Integration system needs to locate buggy commits. Locating them is complicated by flaky tests that pass and fail non-deterministically. The flaky tests introduce noise into the CI system requiring costly reruns to determine if a failure was caused by a bad code change or caused by non-deterministic test behavior. This paper presents an algorithm, Flake Aware Culprit Finding, that locates buggy commits more accurately than a traditional bisection search. The algorithm is based on Bayesian inference and noisy binary search, utilizing prior information about which changes are most likely to contain the bug. A large scale empirical study was conducted at Google on 13,000+ test breakages. The study evaluates the accuracy and cost of the new algorithm versus a traditional deflaked bisection search. View details
    What Breaks Google?
    Abhayendra Singh
    Avi Kondareddy
    (2023)
    Preview abstract In an ideal Continuous Integration (CI) workflow, all potentially impacted builds/tests would be run before submission for every proposed change. In large-scale environments like Google’s mono-repository, this is not feasible to do in terms of both latency and compute cost, given the frequency of change requests and overall size of the codebase. Instead, the compromise is to run more comprehensive testing at later stages of the development lifecycle – batched after submission. These runs detect newly broken tests. Automated culprit finders determine which change broke the tests, the culprit change. In order to make testing at Google more efficient and lower latency, TAP Postsubmit is developing a new scheduling algorithm that utilizes Bug Prediction metrics, features from the change, and historical information about the tests and targets to predict and rank targets by their likelihood to fail. This presentation examines the association between some of our selected features with culprits in Google3. View details
    Improving Fault Localization by Integrating Valueand Predicate Based Causal Inference Techniques
    Yigit Kucuk
    Andy Podgurski
    Proceedings of the International Conference on Software Engineering, IEEE/ACM (2021) (to appear)
    Preview abstract Statistical fault localization (SFL) techniques use execution profiles and success/failure information from software executions, in conjunction with statistical inference, to automatically score program elements based on how likely they are to be faulty. SFL techniques typically employ one type of profile data: either coverage data, predicate outcomes, or variable values. Most SFL techniques actually measure correlation, not causation, between profile values and success/failure, and so they are subject to confounding bias that distorts the scores they produce. This paper presents a new SFL technique, named UniVal, that uses causal inference techniques and machine learning to integrate information about both predicate outcomes and variable values to more accurately estimate the true failure-causing effect of program statements. UniVal was empirically compared to several coverage-based, predicate-based, and value-based SFL techniques on 800 program versions with real faults. View details
    Flake-Aware Culprit Finding
    Collin Johnston
    Eric Nickell
    CI/CD Industry Workshop (2021)
    Preview abstract Flaky, non-deterministic tests make culprit finding (or finding the version of code where a test started failing) in large scale repositories difficult. A naive binary search (or bisect) algorithm will be unreliable if the test being used for the bisection is flaky. If retries are conducted for each tested version the process becomes expensive. We propose a flake-aware culprit finding system which takes into account the prior flakiness of a test and uses a Bayesian probabilistic model to reduce the number of test executions needed to achieve accurate culprit finding faster and using fewer resources than binary search with retries. View details
    Preview abstract Trends in Very Large Scale Continuous Integration 1. What is CI 2. What is meant by "very large scale" 3. Goals in large scale continuous integration (which often compete with each other) 4. Challenges to meeting those goals 5. Categories of solutions 6. Trends View details