Collin Green
Collin is the User Experience Research lead and manager of the Engineering Productivity Research team within Developer Intelligence. The Engineering Productivity Research team brings a data-driven approach to business decisions around engineering productivity. They combine qualitative and quantitative methods to triangulate measurements of productivity. Collin received his Ph.D. in Cognitive Psychology from the University of California, Los Angeles.
Authored Publications
AI-powered software development tooling is changing the way that developers interact with tools and write code. However, the ability of AI to truly transform software development depends on developers' level of trust in the tools. In this work, we take a mixed-methods approach to measuring the factors that influence developers' trust in AI-powered code completion. We found that familiarity with AI suggestions, quality of the suggestion, and level of expertise with the language all increased the acceptance rate of AI-powered suggestions, while suggestion length and presence in a test file decreased acceptance rates. Based on these findings, we propose recommendations for the design of AI-powered development tools to improve trust.
This is the seventh installment of the Developer Productivity for Humans column. This installment focuses on software quality: what it means, how developers see it, how we break it down into four types of quality, and the impact these types have on each other.
Measuring Developer Experience with a Longitudinal Survey
Jessica Lin
Jill Dicker
IEEE Software (2024)
At Google, we’ve been running a quarterly large-scale survey with developers since 2018. In this article, we will discuss how we run EngSat, some of our key learnings over the past 6 years, and how we’ve evolved our approach to meet new needs and challenges.
Measuring the productivity of software developers is inherently difficult; it requires measuring humans doing a complex, creative task. Developers are affected by both technological and sociological aspects of their job, and these aspects need to be evaluated in concert to deeply understand developer productivity.
Using Logs Data to Identify When Software Engineers Experience Flow or Focused Work
Ben Holtz
ACM CHI Conference on Human Factors in Computing Systems (2023) (to appear)
Beyond self-report data, we lack reliable and non-intrusive methods for identifying flow. However, taking a step back and acknowledging that flow occurs during periods of focus gives us the opportunity to make progress towards measuring flow by isolating focused work. Here, we take a mixed-methods approach to design a logs-based metric that leverages machine learning and a comprehensive collection of logs data to identify periods of related actions (indicating focus), and we validate this metric against self-reported time in focus or flow using diary data and quarterly survey data. Our results indicate that we can determine when software engineers at a large technology company experience focused work, which includes instances of flow. This metric speaks to engineering work, but can be leveraged in other domains to non-disruptively measure when people experience focus. Future research can build upon this work to identify signals associated with other facets of flow.
Systemic Gender Inequities in Who Reviews Code
Emerson Murphy-Hill
Jill Dicker
Amber Horvath
Laurie R. Weingart
Nina Chen
Computer Supported Cooperative Work (2023) (to appear)
Code review is an essential task for modern software engineers, where the author of a code change assigns other engineers the task of providing feedback on the author’s code. In this paper, we investigate the task of code review through the lens of equity, the proposition that engineers should share reviewing responsibilities fairly. Through this lens, we quantitatively examine gender inequities in code review load at Google. We found that, on average, women perform about 25% fewer reviews than men, an inequity with multiple systemic antecedents, including authors’ tendency to choose men as reviewers, a recommender system’s amplification of human biases, and gender differences in how reviewer credentials are assigned and earned. Although substantial work remains to close the review load gap, we show how one small change has begun to do so.
What Improves Developer Productivity at Google? Code Quality.
Lan Cheng
Emerson Rex Murphy-Hill
Andrea Marie Knight Dolan
Nan Zhang
Elizabeth Kammer
Foundations of Software Engineering: Industry Paper (2022)
Understanding what affects software developer productivity can help organizations choose wise investments in their technical and social environment. But the research literature either focuses on what correlates with developer productivity in realistic settings or focuses on what causes developer productivity in highly constrained settings. In this paper, we bridge the gap by studying software developers at Google through two analyses. In the first analysis, we use panel data to understand which of 39 productivity factors affect perceived developer productivity, finding that code quality, tech debt, infrastructure tools and support, team communication, goals and priorities, and organizational change and process are all causally linked to developer productivity. In the second analysis, we use a lagged panel analysis to strengthen our causal claims. We find that increases in perceived code quality tend to be followed by increased developer productivity, but not vice versa, providing the strongest evidence to date that code quality affects individual developer productivity.
Engineering Impacts of Anonymous Author Code Review: A Field Experiment
Emerson Rex Murphy-Hill
Jill Dicker
Lan Cheng
Liz Kammer
Ben Holtz
Andrea Marie Knight Dolan
Transactions on Software Engineering (2021)
Code review is a powerful technique to ensure high quality software and spread knowledge of best coding practices between engineers. Unfortunately, code reviewers may have biases about authors of the code they are reviewing, which can lead to inequitable experiences and outcomes. In this paper, we describe a field experiment with anonymous author code review, where we withheld author identity information during 5217 code reviews from 300 professional software engineers at one company. Our results suggest that during anonymous author code review, reviewers can frequently guess authors’ identities; that focus is reduced on reviewer-author power dynamics; and that the practice poses a barrier to offline, high-bandwidth conversations. Based on our findings, we recommend that those who choose to implement anonymous author code review should reveal the time zone of the author by default, have a break-the-glass option for revealing author identity, and reveal author identity directly after the review.
Enabling the Study of Software Development Behavior with Cross-Tool Logs
Ben Holtz
Edward K. Smith
Andrea Marie Knight Dolan
Elizabeth Kammer
Jillian Dicker
Caitlin Harrison Sadowski
Lan Cheng
Emerson Murphy-Hill
IEEE Software, Special Issue on Behavioral Science of Software Engineering (2020)
Understanding developers’ day-to-day behavior can help answer important research questions, but capturing that behavior at scale can be challenging, particularly when developers use many tools in concert to accomplish their tasks. In this paper, we describe our experience creating a system that integrates log data from dozens of development tools at Google, including tools that developers use to email, schedule meetings, ask and answer technical questions, find code, build and test, and review code. The contribution of this article is a technical description of the system, a validation of it, and a demonstration of its usefulness.
Predicting Developers’ Negative Feelings about Code Review
Emerson Murphy-Hill
Elizabeth Kammer
International Conference on Software Engineering (2020)
During code review, developers critically examine each other's code to improve its quality, share knowledge, and ensure conformance to coding standards. In the process, developers may have negative interpersonal interactions with their peers, which can lead to frustration and stress; these negative interactions may ultimately result in developers abandoning projects. In this mixed-methods study at one company, we surveyed 1,317 developers to characterize the negative experiences and cross-referenced the results with objective data from code review logs to predict these experiences. Our results suggest that such negative experiences, which we call “pushback”, are relatively rare in practice, but have negative repercussions when they occur. Our metrics can predict feelings of pushback with high recall but low precision, making them potentially appropriate for highlighting interactions that may benefit from a self-intervention.