Caitlin Sadowski

Dr. Caitlin Sadowski is a senior engineering manager at Google in San Francisco, California, where she leads the Chrome Data team, a cross-functional team of software engineers and data scientists focused on tools, infrastructure, and insights related to measuring Chrome browser usage. In the past, she made static analysis useful at Google by creating the Tricorder program analysis platform and co-founded a team focused on understanding developer productivity. She has a PhD from the University of California at Santa Cruz, where she worked on a variety of research topics related to programming languages, software engineering, and human-computer interaction. She enjoys baking with her two young children.
Authored Publications
    A World Wide View of Browsing the World Wide Web
    Aurore Fass
    Emma Thomas
    Jon Azose
    Kimberly Ruth
    Mark Pearson
    Zakir Durumeric
    Internet Measurement Conference (IMC), ACM (2022) (to appear)
    In this paper, we perform the first large-scale study of how people use and spend time on the web. Our study is based on anonymous, aggregated telemetry data from a major web browser with several hundred million users globally. Our study considers only users who chose to explicitly enable sharing URLs with the browser vendor and have usage statistic reporting enabled. We analyze the distribution of web traffic, the types of popular sites that people visit and spend the most time on, differences between desktop and mobile browsing behavior, geographical differences in web usage, and the websites popular in regions worldwide. Our study sheds light on both online user behavior and how the measurement community can better analyze the web in future research studies.
    Enabling the Study of Software Development Behavior with Cross-Tool Logs
    Collin Green
    Ben Holtz
    Edward K. Smith
    Andrea Marie Knight Dolan
    Elizabeth Kammer
    Jillian Dicker
    James Lin
    Lan Cheng
    Emerson Murphy-Hill
    IEEE Software, Special Issue on Behavioral Science of Software Engineering (2020)
    Understanding developers’ day-to-day behavior can help answer important research questions, but capturing that behavior at scale can be challenging, particularly when developers use many tools in concert to accomplish their tasks. In this paper, we describe our experience creating a system that integrates log data from dozens of development tools at Google, including tools that developers use to email, schedule meetings, ask and answer technical questions, find code, build and test, and review code. The contribution of this article is a technical description of the system, a validation of it, and a demonstration of its usefulness.
    Do Developers Learn New Tools On The Toilet?
    Emerson Murphy-Hill
    Edward K. Smith
    Andrea Knight Dolan
    Andrew Trenk
    Steve Gross
    Proceedings of the 2019 International Conference on Software Engineering
    Maintaining awareness of useful tools is a substantial challenge for developers. Physical newsletters are a simple technique to inform developers about tools. In this paper, we evaluate such a technique, called Testing on the Toilet, by performing a mixed-methods case study. We first quantitatively evaluate how effective this technique is by applying statistical causal inference over six years of data about tools used by thousands of developers. We then qualitatively contextualize these results by interviewing and surveying 382 developers, from authors to editors to readers. We found that the technique was generally effective at increasing software development tool use, although the increase varied depending on factors such as the breadth of applicability of the tool, the extent to which the tool has reached saturation, and the memorability of the tool name.
    What Predicts Software Developers’ Productivity?
    Emerson Murphy-Hill
    David C. Shepherd
    Michael Phillips
    Andrea Knight Dolan
    Edward K. Smith
    Transactions on Software Engineering (2019)
    Organizations have a variety of options to help their software developers become their most productive selves, from modifying office layouts, to investing in better tools, to cleaning up the source code. But which options will have the biggest impact? Drawing from the literature in software engineering and industrial/organizational psychology to identify factors that correlate with productivity, we designed a survey that asked 622 developers across 3 companies about these productivity factors and about self-rated productivity. Our results suggest that the factors that most strongly correlate with self-rated productivity were non-technical factors, such as job enthusiasm, peer support for new ideas, and receiving useful feedback about job performance. Compared to other knowledge workers, our results also suggest that software developers’ self-rated productivity is more strongly related to task variety and ability to work remotely.
    Web Feature Deprecation: A Case Study for Chrome
    Ariana Mirian
    Geoffrey M. Voelker
    Nik Bhagat
    Stefan Savage
    International Conference on Software Engineering (ICSE) SEIP track (2019) (to appear)
    Deprecation is a necessary function for the health and innovation of the web ecosystem. However, web feature deprecation is an understudied topic. While Chrome has a protocol for web feature deprecation, much of this process is based on a mix of few metrics and intuition. In this paper, we analyze web feature deprecations, in an attempt to improve this process. First, we produce a taxonomy of reasons why developers want to deprecate web features. We then provide a set of guidelines for deciding when it is safe to deprecate a web feature and a methodology for approaching the question of whether to deprecate a web feature. Finally, we provide a tool that helps determine whether a web feature meets these guidelines for deprecation. We also discuss the challenges faced during this process.
    When Not to Comment: Questions and Tradeoffs with API Documentation for C++ Projects
    Andrew Head
    Emerson Murphy-Hill
    Andrea Knight
    International Conference on Software Engineering (ICSE) (2018) (to appear)
    Without usable and accurate documentation of how to use an API, programmers can find themselves deterred from reusing relevant code. In C++, one place developers can find documentation is in a header file, but when information is missing, they may look at the corresponding implementation (the “.cc” file). To understand what’s missing from C++ API documentation and whether it should be fixed, we conducted a mixed-methods study. This involved three experience sampling studies with hundreds of developers at the moment they visited implementation code, interviews with 18 of those developers, and interviews with 8 API maintainers. We found that in many cases, updating documentation may provide only limited value for developers, while requiring effort maintainers don’t want to invest. This helps frame future tools and processes designed to fill in missing low-level API documentation.
    Lessons from Building Static Analysis Tools at Google
    Edward Aftandilian
    Alex Eagle
    Liam Miller-Cushon
    Communications of the ACM (CACM), vol. 61, no. 4 (2018), pp. 58-66
    In this article, we describe how we have applied the lessons from Google’s previous experience with FindBugs Java analysis, as well as lessons from the academic literature, to build a successful static analysis infrastructure that is used daily by the majority of engineers at Google. Our tooling detects thousands of issues per day that are fixed by engineers, by their own choice, before the problematic code is checked into the codebase.
    Modern Code Review: A Case Study at Google
    Emma Söderberg
    Luke Church
    Michal Sipko
    Alberto Bacchelli
    International Conference on Software Engineering, Software Engineering in Practice track (ICSE SEIP) (2018) (to appear)
    Employing lightweight, tool-based code review of code changes (aka modern code review) has become the norm for a wide variety of open-source and industrial systems. In this paper, we make an exploratory investigation of modern code review at Google. Google introduced code review early on and evolved it over the years; our study sheds light on why Google introduced this practice and analyzes its current status, after the process has been refined through decades of code changes and millions of code reviews. By means of 12 interviews, a survey with 44 respondents, and the analysis of review logs for 9 million reviewed changes, we investigate motivations behind code review at Google, current practices, and developers’ satisfaction and challenges.
    Discovering API Usability Problems at Scale
    Emerson Murphy-Hill
    Andrew Head
    International Workshop on API Usage and Evolution (WAPI) (2018)
    Software developers’ productivity can be negatively impacted by using APIs incorrectly. In this paper, we describe an analysis technique we designed to find API usability problems by comparing successive file-level changes made by individual software developers. We applied our tool, StopMotion, to the file histories of real developers doing real tasks at Google. The results reveal several API usability challenges including simple typos, conceptual API misalignments, and conflation of similar APIs.
    Advantages and Disadvantages of a Monolithic Codebase
    Andrea Knight
    Edward K. Smith
    Emerson Murphy-Hill
    International Conference on Software Engineering, Software Engineering in Practice track (ICSE SEIP) (2018)
    Monolithic source code repositories (repos) are used by several large tech companies, but little is known about their advantages or disadvantages compared to multiple per-project repos. This paper investigates the relative tradeoffs by utilizing a mixed-methods approach. Our primary contribution is a survey of engineers who have experience with both monolithic repos and multiple, per-project repos. This paper also backs up the claims made by these engineers with a large-scale analysis of developer tool logs. Our study finds that the visibility of the codebase is a significant advantage of a monolithic repo: it enables engineers to discover APIs to reuse, find examples for using an API, and automatically have dependent code updated as an API migrates to a new version. Engineers also appreciate the centralization of dependency management in the repo. In contrast, multiple-repository (multi-repo) systems afford engineers more flexibility to select their own toolchains and provide significant access control and stability benefits. In both cases, the related tooling is also a significant factor; engineers favor particular tools and are drawn to repo management systems that support their desired toolchain.