Software engineering and programming languages

Software engineering and programming language researchers at Google study all aspects of the software development process, from the engineers who make software to the languages and tools that they use.

About the team

We are a collection of teams from across the company who study the problems faced by engineers and invent new technologies to solve them. Our teams take a variety of approaches, including empirical methods, interviews, surveys, innovative tools, formal models, predictive machine learning models, data science, experiments, and mixed-methods research techniques. Because our engineers work within the largest code repository in the world, the solutions need to work at scale, across a global team of engineers and over 2 billion lines of code.

We aim to make an impact internally on Google engineers and externally on the larger ecosystem of software engineers around the world.

Team focus summaries

Developer Tools

Google provides its engineers with cutting-edge developer tools that operate on a codebase with billions of lines of code. The tools are designed to give engineers a consistent view of the codebase so they can navigate and edit any project. We research and create new, unique developer tools that let us get the benefits of such a large codebase while still retaining a fast development velocity.

Developer Inclusion and Diversity

We aim to understand the diversity and inclusion challenges facing software developers and to evaluate interventions that help create an inclusive and equitable culture for all.

Developer Productivity

We use both qualitative and quantitative methods to study how to make engineers more productive. Google uses the results of these studies to improve both our internal developer tools and processes and our external offerings for developers on GCP and Android.

Program Analysis and Refactoring

We build static and dynamic analysis tools that find serious bugs and prevent them from manifesting in both Google’s and third-party code. We also leverage this large-scale analysis infrastructure to refactor Google’s code at scale.

Machine Learning for Code

We apply deep learning to Google’s large, well-curated codebase to automatically write code and repair bugs.

Programming Language Design and Implementation

We design, evaluate, and implement new features for popular programming languages like Java, C++, and Go through their standards processes.

Automated Software Testing and Continuous Integration

We design, implement, and evaluate tools and frameworks to automate the testing process and integrate tests with the Google-wide continuous integration infrastructure.

Featured publications

Enabling the Study of Software Development Behavior with Cross-Tool Logs

Ciera Jaspan, Matthew Jorde, Carolyn Denomme Egelman, Collin Green, Ben Holtz, Edward K. Smith, Maggie Morrow Hodges, Andrea Marie Knight Dolan, Elizabeth Kammer, Jillian Dicker, Caitlin Harrison Sadowski, James Lin, Lan Cheng, Mark Canning, Emerson Murphy-Hill

IEEE Software, Special Issue on Behavioral Science of Software Engineering (2020)

What Predicts Software Developers’ Productivity?

Emerson Murphy-Hill, Ciera Jaspan, Caitlin Sadowski, David C. Shepherd, Michael Phillips, Collin Winter, Andrea Knight Dolan, Edward K. Smith, Matthew A. Jorde

Transactions on Software Engineering (2019)

FUDGE: Fuzz Driver Generation at Scale

Domagoj Babic, Stefan Bucur, Yaohui Chen, Franjo Ivancic, Tim King, Markus Kusano, Caroline Lemieux, László Szekeres, Wei Wang

Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ACM

DeepDelta: Learning to Repair Compilation Errors

Ali Mesbah, Andrew Rice, Emily Johnston, Nick Glorioso, Eddie Aftandilian

Proceedings of the 2019 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) (2019)

State of Mutation Testing at Google

Goran Petrovic, Marko Ivankovic

Proceedings of the 40th International Conference on Software Engineering 2017 (SEIP) (2018) (to appear)

Engineering Impacts of Anonymous Author Code Review: A Field Experiment

Emerson Rex Murphy-Hill, Jill Dicker, Maggie Hodges, Carolyn Denomme Egelman, Ciera Nicole Christopher Jaspan, Lan Cheng, Liz Kammer, Ben Holtz, Matthew A. Jorde, Andrea Marie Knight Dolan, Collin Green

Transactions on Software Engineering (2021)

Who Broke the Build? Automatically Identifying Changes That Induce Test Failures In Continuous Integration at Google Scale

Celal Ziftci, Jim Reardon

Proceedings of the 39th International Conference on Software Engineering: Software Engineering in Practice Track, IEEE Press, Buenos Aires, Argentina (2017), pp. 113-122 (to appear)