![Caitlin Sadowski](https://storage.googleapis.com/gweb-research2023-media/pubtools/247.png)
Caitlin Sadowski
Dr. Caitlin Sadowski is a senior engineering manager at Google in San Francisco, California, where she leads the Chrome Data team: a cross-functional team of software engineers and data scientists focused on tools, infrastructure, and insights related to measuring Chrome browser usage. In the past, she made static analysis useful at Google by creating the Tricorder program analysis platform and co-founded a team focused on understanding developer productivity. She has a PhD from the University of California at Santa Cruz, where she worked on a variety of research topics related to programming languages, software engineering, and human-computer interaction. She enjoys baking with her two young children.
Authored Publications
A World Wide View of Browsing the World Wide Web
Aurore Fass
Emma Thomas
Jon Azose
Kimberly Ruth
Mark Pearson
Zakir Durumeric
Internet Measurement Conference (IMC), ACM (2022) (to appear)
In this paper, we perform the first large-scale study of how people use and spend time on the web. Our study is based on anonymous, aggregated telemetry data from a major web browser with several hundred million users globally. Our study considers only users who chose to explicitly enable sharing URLs with the browser vendor and have usage statistic reporting enabled. We analyze the distribution of web traffic, the types of popular sites that people visit and spend the most time on, differences between desktop and mobile browsing behavior, geographical differences in web usage, and the websites popular in regions worldwide. Our study sheds light on both online user behavior and how the measurement community can better analyze the web in future research studies.
Enabling the Study of Software Development Behavior with Cross-Tool Logs
Collin Green
Ben Holtz
Edward K. Smith
Andrea Marie Knight Dolan
Elizabeth Kammer
Jillian Dicker
James Lin
Lan Cheng
Emerson Murphy-Hill
IEEE Software, Special Issue on Behavioral Science of Software Engineering (2020)
Understanding developers’ day-to-day behavior can help answer important research questions, but capturing that behavior at scale can be challenging, particularly when developers use many tools in concert to accomplish their tasks. In this paper, we describe our experience creating a system that integrates log data from dozens of development tools at Google, including tools that developers use to email, schedule meetings, ask and answer technical questions, find code, build and test, and review code. The contribution of this article is a technical description of the system, a validation of it, and a demonstration of its usefulness.
Do Developers Learn New Tools On The Toilet?
Emerson Murphy-Hill
Edward K. Smith
Andrea Knight Dolan
Andrew Trenk
Steve Gross
Proceedings of the 2019 International Conference on Software Engineering
Maintaining awareness of useful tools is a substantial challenge for developers. Physical newsletters are a simple technique to inform developers about tools. In this paper, we evaluate such a technique, called Testing on the Toilet, by performing a mixed-methods case study. We first quantitatively evaluate how effective this technique is by applying statistical causal inference over six years of data about tools used by thousands of developers. We then qualitatively contextualize these results by interviewing and surveying 382 developers, from authors to editors to readers. We found that the technique was generally effective at increasing software development tool use, although the increase varied depending on factors such as the breadth of applicability of the tool, the extent to which the tool has reached saturation, and the memorability of the tool name.
What Predicts Software Developers’ Productivity?
Emerson Murphy-Hill
David C. Shepherd
Michael Phillips
Andrea Knight Dolan
Edward K. Smith
Transactions on Software Engineering (2019)
Organizations have a variety of options to help their software developers become their most productive selves, from modifying office layouts, to investing in better tools, to cleaning up the source code. But which options will have the biggest impact? Drawing from the literature in software engineering and industrial/organizational psychology to identify factors that correlate with productivity, we designed a survey that asked 622 developers across 3 companies about these productivity factors and about self-rated productivity. Our results suggest that the factors that most strongly correlate with self-rated productivity were non-technical factors, such as job enthusiasm, peer support for new ideas, and receiving useful feedback about job performance. Our results also suggest that, compared to other knowledge workers, software developers’ self-rated productivity is more strongly related to task variety and ability to work remotely.
Web Feature Deprecation: A Case Study for Chrome
Ariana Mirian
Geoffrey M. Voelker
Nik Bhagat
Stefan Savage
International Conference on Software Engineering (ICSE) SEIP track (2019) (to appear)
Deprecation is a necessary function for the health and innovation of the web ecosystem. However, web feature deprecation is an understudied topic. While Chrome has a protocol for web feature deprecation, much of this process is based on a mix of few metrics and intuition. In this paper, we analyze web feature deprecations, in an attempt to improve this process. First, we produce a taxonomy of reasons why developers want to deprecate web features. We then provide a set of guidelines for deciding when it is safe to deprecate a web feature and a methodology for approaching the question of whether to deprecate a web feature. Finally, we provide a tool that helps determine whether a web feature meets these guidelines for deprecation. We also discuss the challenges faced during this process.
When Not to Comment: Questions and Tradeoffs with API Documentation for C++ Projects
Andrew Head
Emerson Murphy-Hill
Andrea Knight
International Conference on Software Engineering (ICSE) (2018) (to appear)
Without usable and accurate documentation of how to use an API, programmers can find themselves deterred from reusing relevant code. In C++, one place developers can find documentation is in a header file, but when information is missing, they may look at the corresponding implementation (the “.cc” file). To understand what’s missing from C++ API documentation and whether it should be fixed, we conducted a mixed-methods study. This involved three experience sampling studies with hundreds of developers at the moment they visited implementation code, interviews with 18 of those developers, and interviews with 8 API maintainers. We found that in many cases, updating documentation may provide only limited value for developers, while requiring effort maintainers don’t want to invest. This helps frame future tools and processes designed to fill in missing low-level API documentation.
Lessons from Building Static Analysis Tools at Google
Edward Aftandilian
Alex Eagle
Liam Miller-Cushon
Communications of the ACM (CACM), vol. 61, issue 4 (2018), pp. 58-66
In this article, we describe how we have applied the lessons from Google’s previous experience with FindBugs Java analysis, as well as lessons from the academic literature, to build a successful static analysis infrastructure that is used daily by the majority of engineers at Google. Our tooling detects thousands of issues per day that are fixed by engineers, by their own choice, before the problematic code is checked into the codebase.
Modern Code Review: A Case Study at Google
Emma Söderberg
Luke Church
Michal Sipko
Alberto Bacchelli
International Conference on Software Engineering, Software Engineering in Practice track (ICSE SEIP) (2018) (to appear)
Employing lightweight, tool-based code review of code changes (aka modern code review) has become the norm for a wide variety of open-source and industrial systems. In this paper, we make an exploratory investigation of modern code review at Google. Google introduced code review early on and evolved it over the years; our study sheds light on why Google introduced this practice and analyzes its current status, after the process has been refined through decades of code changes and millions of code reviews. By means of 12 interviews, a survey with 44 respondents, and the analysis of review logs for 9 million reviewed changes, we investigate motivations behind code review at Google, current practices, and developers’ satisfaction and challenges.
Discovering API Usability Problems at Scale
Emerson Murphy-Hill
Andrew Head
International Workshop on API Usage and Evolution (WAPI) (2018)
Software developers’ productivity can be negatively impacted by using APIs incorrectly. In this paper, we describe an analysis technique we designed to find API usability problems by comparing successive file-level changes made by individual software developers. We applied our tool, StopMotion, to the file histories of real developers doing real tasks at Google. The results reveal several API usability challenges including simple typos, conceptual API misalignments, and conflation of similar APIs.
Advantages and Disadvantages of a Monolithic Codebase
Andrea Knight
Edward K. Smith
Emerson Murphy-Hill
International Conference on Software Engineering, Software Engineering in Practice track (ICSE SEIP) (2018)
Monolithic source code repositories (repos) are used by several large tech companies, but little is known about their advantages or disadvantages compared to multiple per-project repos. This paper investigates the relative tradeoffs by utilizing a mixed-methods approach. Our primary contribution is a survey of engineers who have experience with both monolithic repos and multiple, per-project repos. This paper also backs up the claims made by these engineers with a large-scale analysis of developer tool logs. Our study finds that the visibility of the codebase is a significant advantage of a monolithic repo: it enables engineers to discover APIs to reuse, find examples for using an API, and automatically have dependent code updated as an API migrates to a new version. Engineers also appreciate the centralization of dependency management in the repo. In contrast, multiple-repository (multi-repo) systems afford engineers more flexibility to select their own toolchains and provide significant access control and stability benefits. In both cases, the related tooling is also a significant factor; engineers favor particular tools and are drawn to repo management systems that support their desired toolchain.