Jump to Content
Emerson Murphy-Hill

Emerson Murphy-Hill

Emerson is a Staff Research Scientist and a member of Google's Product Inclusion & Equity and Engineering Productivity Research teams. His research spans software engineering and human-computer interaction, winning an NSF CAREER Award in 2013 and five ACM SIGSOFT Distinguished Paper Awards. He's currently an associate editor for Empirical Software Engineering, and on several program committees, including the 2023 International Conference on Software Engineering. Previously, he was an Associate Professor at North Carolina State University where he directed the Developer Liberation Front. You can reach him at emersonm@google.com.
Authored Publications
Google Publications
Other Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    Systemic Gender Inequities in Who Reviews Code
    Jill Dicker
    Amber Horvath
    Laurie R. Weingart
    Nina Chen
    Computer Supported Cooperative Work (2023) (to appear)
    Preview abstract Code review is an essential task for modern software engineers, where the author of a code change assigns other engineers the task of providing feedback on the author’s code. In this paper, we investigate the task of code review through the lens of equity, the proposition that engineers should share reviewing responsibilities fairly. Through this lens, we quantitatively examine gender inequities in code review load at Google. We found that, on average, women perform about 25% fewer reviews than men, an inequity with multiple systemic antecedents, including authors’ tendency to choose men as reviewers, a recommender system’s amplification of human biases, and gender differences in how reviewer credentials are assigned and earned. Although substantial work remains to close the review load gap, we show how one small change has begun to do so. View details
    Building and Sustaining Ethnically, Racially, and Gender Diverse Software Engineering Teams: A Study at Google
    Ella Dagan
    Anita Sarma
    Alison Chang
    Jill Dicker
    The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) (2023) (to appear)
    Preview abstract Teams that build software are largely demographically homogeneous. Without diversity, homogeneous perspectives dominate how, why, and for whom software is designed. To understand how teams can successfully build and sustain diversity, we interviewed 11 engineers and 9 managers from some of the most gender and racially diverse teams at Google, a large software company. Qualitatively analyzing the interviews, we found shared approaches to recruiting, hiring, and promoting an inclusive environment, all of which create a positive feedback loop. Our findings produce actionable practices that every member of the team can take to increase diversity by fostering a more inclusive software engineering environment. View details
    Preview abstract Code review is a common practice in software organizations, where software engineers give each other feedback about a code change. As in other human decision-making processes, code review is susceptible to human biases, where reviewers’ feedback to the author may depend on how reviewers perceive the author’s demographic identity, whether consciously or unconsciously. Through the lens of role congruity theory, we show that the amount of pushback that code authors receive varies based on their gender, race/ethnicity, and age. Furthermore, we estimate that such pushback costs Google more than 1000 extra engineer hours every day, or about 4% of the estimated time engineers spend responding to reviewer comments, a cost borne by non-White and non-male engineers. View details
    Destructive Criticism in Software Code Review Impacts Inclusion
    Sanuri Dananja Gunawardena
    Peter Devine
    Isabelle Beaumont
    Lola Garden
    Kelly Blincoe
    Computer Supported Cooperative Work (2022)
    Preview abstract The software industry lacks gender diversity. Recent research has suggested that a toxic working culture is to blame. Studies have found that communications in software repositories directed towards women are more negative in general. In this study, we use a destructive criticism lens to examine gender differences in software code review feedback. Software code review is a practice where code is peer reviewed and negative feedback is often delivered. We explore differences in perceptions, frequency, and impact of destructive criticism across genders. We surveyed 93 software practitioners eliciting perceived reactions to hypothetical scenarios (or vignettes) where participants are asked to imagine receiving either constructive or destructive criticism. In addition, the survey collected general opinions on feedback obtained during software code review as well as the frequency that participants give and receive destructive criticism. We found that opinions on destructive criticism vary. Women perceive destructive criticism as less appropriate and are less motivated to continue working with the developer after receiving destructive criticism. Destructive criticism is fairly common with more than half of respondents having received nonspecific negative feedback and nearly a quarter having received inconsiderate negative feedback in the past year. Our results suggest that destructive criticism in code review could be a contributing factor to the lack of gender diversity observed in the software industry. View details
    Detecting Interpersonal Conflict in Issues and Code Review: Cross Pollinating Open- and Closed-Source Approaches
    Huilian Sophie Qiu
    Bogdan Vasilescu
    Christian Kästner
    International Conference on Software Engineering: Software Engineering on Society (2022)
    Preview abstract In software engineering, interpersonal conflict in code review, such as toxic language or an unnecessary pushback on a change request, is a well-known and extensively studied problem because it is associated with negative outcomes, such as stress and turnover. One effective approach to prevent and mitigate toxic language is to develop automatic detection. Two most-recent attempts on automatic detection were developed under different settings: a toxicity detector using text analytics for open source issue discussions and a pushback detector using logs-based metrics for corporate code reviews. While these settings are arguably distinct, the behaviors that they can capture share similarities. Our work studies how the toxicity detector and the pushback detector can be generalized beyond the respective contexts in which they were developed and how the combination of the two can improve interpersonal conflict detection. This research has implications for designing interventions and offers an opportunity to apply a technique to both open and closed source software, possibly benefiting from synergies, a rarity in software engineering research, in our experience. View details
    What Improves Developer Productivity at Google? Code Quality.
    Lan Cheng
    Andrea Marie Knight Dolan
    Nan Zhang
    Elizabeth Kammer
    Foundations of Software Engineering: Industry Paper (2022)
    Preview abstract Understanding what affects software developer productivity can help organizations choose wise investments in their technical and social environment. But the research literature either focuses on what correlates with developer productivity in realistic settings or focuses on what causes developer productivity in highly constrained settings. In this paper, we bridge the gap by studying software developers at Google through two analyses. In the first analysis, we use panel data to understand which of 39 productivity factors affect perceived developer productivity, finding that code quality, tech debt, infrastructure tools and support, team communication, goals and priorities, and organizational change and process are all causally linked to developer productivity. In the second analysis, we use a lagged panel analysis to strengthen our causal claims. We find that increases in perceived code quality tend to be followed by increased developer productivity, but not vice versa, providing the strongest evidence to date that code quality affects individual developer productivity. View details
    Preview abstract Code review is a powerful technique to ensure high quality software and spread knowledge of best coding practices between engineers. Unfortunately, code reviewers may have biases about authors of the code they are reviewing, which can lead to inequitable experiences and outcomes. In this paper, we describe a field experiment with anonymous author code review, where we withheld author identity information during 5217 code reviews from 300 professional software engineers at one company. Our results suggest that during anonymous author code review, reviewers can frequently guess authors’ identities; that focus is reduced on reviewer-author power dynamics; and that the practice poses a barrier to offline, high-bandwidth conversations. Based on our findings, we recommend that those who choose to implement anonymous author code review should reveal the time zone of the author by default, have a break-the-glass option for revealing author identity, and reveal author identity directly after the review. View details
    Preview abstract During code review, developers critically examine each others’ code to improve its quality, share knowledge, and ensure conformance to coding standards. In the process, developers may have negative interpersonal interactions with their peers, which can lead to frustration and stress; these negative interactions may ultimately result in developers abandoning projects. In this mixed-methods study at one company, we surveyed 1,317 developers to characterize the negative experiences and cross-referenced the results with objective data from code review logs to predict these experiences. Our results suggest that such negative experiences, which we call “pushback”, are relatively rare in practice, but have negative repercussions when they occur. Our metrics can predict feelings of pushback with high recall but low precision, making them potentially appropriate for highlighting interactions that may benefit from a self-intervention. View details
    Preview abstract Static analysis tools can help prevent security incidents, but to do so, they must enable developers to resolve the defects they detect. Unfortunately, developers often struggle to interact with the interfaces of these tools, leading to tool abandonment, and consequently the proliferation of preventable vulnerabilities. Simply put, the usability of static analysis tools is crucial. The usable security community has successfully identified and remedied usability issues in end user security applications, like PGP and Tor browsers, by conducting usability evaluations. Inspired by the success of these studies, we conducted a heuristic walkthrough evaluation and user study focused on four security-oriented static analysis tools. Through the lens of these evaluations, we identify several issues that detract from the usability of static analysis tools. The issues we identified range from workflows that do not support developers to interface features that do not scale. We make these findings actionable by outlining how our results can be used to improve the state-of-the-art in static analysis tool interfaces. View details
    Enabling the Study of Software Development Behavior with Cross-Tool Logs
    Ben Holtz
    Edward K. Smith
    Andrea Marie Knight Dolan
    Elizabeth Kammer
    Jillian Dicker
    Lan Cheng
    IEEE Software, vol. Special Issue on Behavioral Science of Software Engineering (2020)
    Preview abstract Understanding developers’ day-to-day behavior can help answer important research questions, but capturing that behavior at scale can be challenging, particularly when developers use many tools in concert to accomplish their tasks. In this paper, we describe our experience creating a system that integrates log data from dozens of development tools at Google, including tools that developers use to email, schedule meetings, ask and answer technical questions, find code, build and test, and review code. The contribution of this article is a technical description of the system, a validation of it, and a demonstration of its usefulness. View details
    Preview abstract In modern data analytics, practices from software development are increasingly necessary to manage data,but they must be incorporated alongside other statistical and scientific skills. Therefore, we ask: how does a community recontextualize and reinterpret software development through the unique pressures of their work? To answer this, we explore the data-centric community around baseball analytics, or sabermetrics. To discover development’s place in the search for robust statistical insight in sports, we interview 10 participants in the sabermetric community and survey over 120 more data analysts, both in baseball and not. We explore how their work lives at the intersection of science and entertainment, and as a consequence, baseball data serves as an accessible yet deep subject to practice analytic skills. Software development exists within an iterative research process that cycles between defining rigorous statistical methods and preserving the flexibility to chase interesting problems. In this question-driven process, members of the community inhabit several overlapping roles of intentional work, in which software development can become the priority to support research and statistical infrastructure, and we discuss the way that the community can foster the balance of these skills View details
    Investigating the Effects of Gender Bias on GitHub
    Nasif Imtiaz
    Justin Middleton
    Joymallya Chakraborty
    Neill Robson
    Gina Bai
    Proceedings of the 2019 International Conference on Software Engineering
    Preview abstract Diversity, including gender diversity, is valued by many software development organizations, yet the field remains dominated by men. One reason for this lack of diversity is gender bias. In this paper, we study the effects of that bias by using an existing framework derived from the gender studies literature. We adapt the four main effects proposed in the framework by posing hypotheses about how they might manifest on GitHub, then evaluate those hypotheses quantitatively. While our results show that effects of gender bias are largely invisible on the GitHub platform itself, there are still signals of women concentrating their work in fewer places and being more restrained in communication than men. View details
    Do Developers Learn New Tools On The Toilet?
    Edward K. Smith
    Andrea Knight Dolan
    Andrew Trenk
    Steve Gross
    Proceedings of the 2019 International Conference on Software Engineering
    Preview abstract Maintaining awareness of useful tools is a substantial challenge for developers. Physical newsletters are a simple technique to inform developers about tools. In this paper, we evaluate such a technique, called Testing on the Toilet, by performing a mixed-methods case study. We first quantitatively evaluate how effective this technique is by applying statistical causal inference over six years of data about tools used by thousands of developers. We then qualitatively contextualize these results by interviewing and surveying 382 developers, from authors to editors to readers. We found that the technique was generally effective at increasing software development tool use, although the increase varied depending on factors such as the breadth of applicability of the tool, the extent to which the tool has reached saturation, and the memorability of the tool name. View details
    What Predicts Software Developers’ Productivity?
    David C. Shepherd
    Michael Phillips
    Andrea Knight Dolan
    Edward K. Smith
    Transactions on Software Engineering (2019)
    Preview abstract Organizations have a variety of options to help their software developers become their most productive selves, from modifying office layouts, to investing in better tools, to cleaning up the source code. But which options will have the biggest impact? Drawing from the literature in software engineering and industrial/organizational psychology to identify factors that correlate with productivity, we designed a survey that asked 622 developers across 3 companies about these productivity factors and about self-rated productivity. Our results suggest that the factors that most strongly correlate with self-rated productivity were non-technical factors, such as job enthusiasm, peer support for new ideas, and receiving useful feedback about job performance. Compared to other knowledge workers, our results also suggest that software developers’ self-rated productivity is more strongly related to task variety and ability to work remotely. View details
    Preview abstract Software developers’ productivity can be negatively impacted by using APIs incorrectly. In this paper, we describe an analysis technique we designed to find API usability problems by comparing successive file-level changes made by individual software developers. We applied our tool, StopMotion, to the file histories of real developers doing real tasks at Google. The results reveal several API usability challenges including simple typos, conceptual API misalignments, and conflation of similar APIs. View details
    Advantages and Disadvantages of a Monolithic Codebase
    Andrea Knight
    Edward K. Smith
    International Conference on Software Engineering, Software Engineering in Practice track (ICSE SEIP) (2018)
    Preview abstract Monolithic source code repositories (repos) are used by several large tech companies, but little is known about their advantages or disadvantages compared to multiple per-project repos. This paper investigates the relative tradeoffs by utilizing a mixed-methods approach. Our primary contribution is a survey of engineers who have experience with both monolithic repos and multiple, per-project repos. This paper also backs up the claims made by these engineers with a large-scale analysis of developer tool logs. Our study finds that the visibility of the codebase is a significant advantage of a monolithic repo: it enables engineers to discover APIs to reuse, find examples for using an API, and automatically have dependent code updated as an API migrates to a new version. Engineers also appreciate the centralization of dependency management in the repo. In contrast, multiple-repository (multi-repo) systems afford engineers more flexibility to select their own toolchains and provide significant access control and stability benefits. In both cases, the related tooling is also a significant factor; engineers favor particular tools and are drawn to repo management systems that support their desired toolchain. View details
    No Results Found