Jeffrey Dean

I joined Google in mid-1999, and I'm currently Google's Chief Scientist, focusing on AI advances for Google DeepMind and Google Research. My areas of focus include machine learning and AI and applications of AI to problems that help billions of people in societally beneficial ways. I have a broad variety of interests, including machine learning, large-scale distributed systems, computer systems performance, compression techniques, information retrieval, application of machine learning to search and other related problems, microprocessor architecture, compiler optimizations, and the development of new products that organize information in new and interesting ways. My Google Scholar page has a complete list of research papers I have co-authored.

In 2011, I co-founded the Google Brain project/team, focused on making progress towards intelligent machines. Since then, my individual work has focused on research, systems and applications for AI and ML, as well as steering the direction of our broader AI/ML and computer science research community. For the past few years, I’ve had the great pleasure to write a blog post early each year summarizing many pieces of the public work done by amazing colleagues and researchers over the previous year in our research teams (despite the similar-sounding titles, these annual blog posts are each quite different!).

A (possibly out of date) resume is here.

Some of the areas I’ve worked on in AI and ML (generally with many collaborators!) include:

  • Research leadership. Steering the research directions of the Google Brain team, Google Research, and now Google DeepMind (with many others!). See year-end blog post links above for more details about this, which includes advances in things like the Transformer architecture, machine learning systems (DistBelief, TensorFlow, Pathways), TPUs, the Inception model, word2vec, seq2seq models, neural machine translation, distillation, neural architecture search/AutoML, RankBrain, BERT, JAX, PaLM, PaLM 2, PaLI, PaLM-E, Med-PaLM, NeRF, quantum computing advances, ML for chip design, computational photography (e.g. Night Sight & Magic Eraser), flood forecasting, Responsible AI research areas like bias, fairness and interpretability, medical diagnostics, auction theory, open source software and datasets, accessibility, weather forecasting, ML for robotics, connectomics, genomics, and more, as well as research impact in products across nearly all of Google, including Search, Ads, YouTube, Gmail, Workspace, Maps, News, Photos, Translate, Android, Cloud, Pixel, Waymo, and many more products.

  • Computer systems for ML. The design and implementation of three generations of systems for training and deploying deep learning models: DistBelief, TensorFlow, and Pathways.

    In DistBelief, we explored large-scale, highly distributed systems and asynchronous training algorithms to enable ML models to be trained on large amounts of data, even on the relatively slow, non-ML-optimized hardware of the time (we trained models with 2B non-embedding parameters at a time when the largest models reported in the literature were 10M to 50M parameters). The system was used for hundreds of projects within Google and had widespread use across many Google products. Some of the earliest research work we did using DistBelief was exploring unsupervised learning on video frames to see what sorts of representations would emerge, in Building high-level features using large scale unsupervised learning, a.k.a. "the cat neuron paper". We also used DistBelief to develop word2vec, various speech recognition models, multimodal work like DeViSE, and early embedding models like RankBrain.

    TensorFlow: I was one of the primary designers and implementors of the initial TensorFlow system. I made the case that we should open-source TensorFlow, and we released it as an open source project in 2015, hosted on GitHub. It is used by millions of researchers and developers all over the world for exploring and creating ML and AI systems on platforms ranging from tiny embedded systems, to phones, desktop computers, and ML supercomputers. For detailed papers on TensorFlow, see TensorFlow: Large-scale machine learning on heterogeneous distributed systems (white paper) and TensorFlow: A System for Large-Scale Machine Learning (OSDI 2016).

    Pathways is designed to support large-scale, multimodal, sparse architectures that are capable of solving thousands or millions of tasks. I was one of the original designers and implementers, and a paper about the systems research aspects of Pathways appeared in MLSys 2022 as Pathways: Asynchronous Distributed Dataflow for ML. The underlying system software has been used for work like the PaLM language models (which underlie work like Med-PaLM and PaLM-E for robotics), PaLI, and other downstream uses.

  • Language modeling. I have worked on many different projects related to language modeling, starting with work in 2007 that trained 300 billion parameter language models on trillions of tokens of text (Large language models in machine translation), demonstrating significant improvements in translation quality.

    I was a co-author on a pair of papers that introduced an approach of learning distributed representations of words that is now commonly called word2vec (Efficient estimation of word representations in vector space and Distributed representations of words and phrases and their compositionality).

    I was one of many who helped to convert the Google Translate system over to using a neural machine translation system, with further significant gains to translation quality. See Google’s neural machine translation system: Bridging the gap between human and machine translation (2016) and Google’s multilingual neural machine translation system: Enabling zero-shot translation. Gideon Lewis-Kraus of The NY Times magazine wrote an in-depth feature about the rollout of the neural machine translation system in Google Translate in The Great AI Awakening.

    Part of the infrastructure work on Pathways is designed to enable scaling training of larger models on larger and more diverse datasets. I worked on the PaLM language model work, and I am one of the co-leads of the Gemini effort, which is building next-generation multimodal models that can use tools and APIs to enable more capable models that can be used in a variety of Google products and application areas.

  • Distillation. I am one of the co-creators of a machine learning technique called distillation, a now-widely-used approach for transferring the knowledge from one neural network to another. It is often used to create smaller, much more efficient models for inference from larger, more unwieldy models, and it can also be used to transfer knowledge from one neural network architecture to a completely different architecture. See Distilling the Knowledge in a Neural Network.

  • Sparse models. I have been involved in a series of work on sparse model architectures for neural networks, including Outrageously large neural networks: The sparsely-gated mixture-of-experts layer (2017) and Designing Effective Sparse Expert Models. A review of approaches for sparse models appears in A Review of Sparse Expert Models in Deep Learning.

  • AI for ASIC chip design. I have worked on research on how to apply reinforcement learning to the problem of placement and routing in ASIC chip design. We have shown that it is possible to get performance that is as good as or better than human performance on the problem of chip floorplanning in a system that runs in a few hours. Our work here was published in Nature and has been used for multiple generations of Google’s TPU ML accelerators.

  • ML for healthcare. I have worked on the use of AI and machine learning in healthcare settings. We have done work showing that machine learning on deidentified medical records can produce useful and actionable suggestions for clinicians, published as Scalable and Accurate Deep Learning with Electronic Health Records. The broader research community at Google has also done work on applying machine learning across many different problems in health, including medical imaging diagnostics, genomics, medical note transcription and summarization, and novel sensing (see health sections of year-in-review blog posts above). I’ve also collaborated on a couple of review articles in this space. One assessed some of the most promising directions for integrating deep learning into healthcare settings, and was published in Nature Medicine as A Guide to Deep Learning in Healthcare. The other was a NEJM article titled Machine Learning in Medicine.

  • ML for computer systems. I have worked with many others on advancing the use of machine learning for tackling computer systems problems. Among these are device placement using reinforcement learning to map abstract ML computation graphs onto a set of physical devices in order to give the best performance (and some follow-on work on a hierarchical version of this), and the use of learned index structures in database systems instead of traditional data structures like B-trees and hash tables.

  • Energy efficiency of machine learning. I have helped push forward Google’s TPU efforts, identifying fairly early in the widespread use of deep learning that creating efficient systems was going to require building customized accelerator hardware, leading to a long line of TPU processors. TPUv1 (In-datacenter Performance Analysis of a Tensor Processing Unit) targeted inference computations and was about 30X - 80X better performance/Watt than contemporary CPUs and GPUs. Subsequent TPU generations target both training and inference in large-scale ML accelerator systems and are crucial to much of the machine learning research and product applications of ML at Google. They are available to external entities as Google Cloud TPUs.

    Carbon emissions of machine learning training is an area that is rife with misinformation due to the prevalence of flawed and inaccurate estimates, so I have also worked with others to correct some of this misinformation and put actual measured data into the literature. See Carbon emissions and large neural network training, especially appendices C and D, and The carbon footprint of machine learning training will plateau, then shrink (if ML researchers adopt best practices). I gave a talk on some of these issues at the 2022 MIT Climate Impacts of Computing and Communications workshop.
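
The distillation technique described in the list above can be illustrated with a small sketch. This is one common formulation, not necessarily the exact setup used in any Google system: the teacher's and student's logits are both softened with a temperature, and the student is trained to minimize the cross-entropy against the teacher's softened distribution. The function names, the temperature value, and the example logits are assumptions made for illustration; practical setups usually also combine this term with a loss on the true labels.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    Hypothetical helper: a higher temperature flattens both distributions,
    so the student also learns from the teacher's relative preferences
    among the less likely classes.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


# Illustrative logits only; not taken from any real model.
teacher = [5.0, 2.0, 0.5]
loss_matching = distillation_loss(teacher, [5.0, 2.0, 0.5])
loss_mismatched = distillation_loss(teacher, [0.5, 2.0, 5.0])
```

A student whose softened predictions match the teacher's reaches the lowest possible value of this loss, so `loss_matching` is smaller than `loss_mismatched`.
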

While at Google, I've also worked on the following:
  • Google Search. The design and implementation of five generations of our crawling, indexing, and query serving systems, covering two and three orders of magnitude growth in number of documents searched, number of queries handled per second, and frequency of updates to the system. We did not publish research papers on most aspects of this, but I gave a talk at WSDM'09 about some of the issues involved in building large-scale retrieval systems (slides).
  • Search ranking algorithms. Some aspects of our search ranking algorithms, notably improved handling of off-page signals such as anchor text.
  • Search ranking prototyping system. The design and implementation of prototyping infrastructure for rapid development and experimentation with new ranking algorithms.
  • MapReduce. The design and implementation of MapReduce, a system for simplifying the development of large-scale data processing applications. A paper about MapReduce appeared in OSDI'04. MapReduce is used extensively within Google, and provided the inspiration for external open-source projects like Hadoop, as well as follow-on projects like Flume.

  • BigTable. The design and implementation of BigTable, a large-scale semi-structured storage system used underneath a number of Google products. A paper about BigTable appeared in OSDI'06. BigTable is used by hundreds of teams at Google and sits underneath dozens of products. It is available externally as Cloud Bigtable. As of 2023, BigTable processes more than 6 billion requests per second at peak and has over 10 exabytes of data under management.

  • Spanner. The design and implementation of Spanner, a geographically-distributed worldwide storage system that can provide strong consistency guarantees through the use of Paxos and highly synchronized clocks in multiple data centers. A paper about Spanner appeared in OSDI’12. Spanner is used extensively for hundreds of projects within Google, underlies a large fraction of our products, and is available for external uses as Google’s Cloud Spanner product.

  • Google Ads. I was part of a group of three people who did the design and implementation of the initial version of Google's advertising serving system.
  • AdSense. The initial development of Google's AdSense for Content product (involving both the production serving system design and implementation as well as work on developing and improving the quality of ad selection based on the contents of pages).
  • Protocol buffers. The development of Protocol Buffers, a way of encoding structured data in an efficient yet extensible format, and a compiler that generates convenient wrappers for manipulating the objects in a variety of languages. Protocol Buffers are used extensively at Google for almost all RPC protocols, and for storing structured information in a variety of persistent storage systems. A version of the protocol buffer implementation has been open-sourced and is available at https://github.com/protocolbuffers/protobuf/, and a developer site with documentation and more details is at https://protobuf.dev/.
  • Google News. Some of the initial production serving system work for the Google News product, working with Krishna Bharat to move the prototype system he put together into a deployed system.

  • Job scheduling system. The design and implementation of the first generation of our automated job scheduling system for managing a cluster of machines.
  • Timeseries analysis system. The initial design and implementation of a system for analyzing complex timeseries data. This system is used extensively by dozens of Google teams to support various use cases like suggested completions, recommendations, etc. The system is available for Cloud customers to analyze their own datasets via the Timeseries Insights API.

  • Google Translate. Some of the production system design for Google Translate, our statistical machine translation system. In particular, I designed and implemented a system for distributed high-speed access to very large language models (too large to fit in memory on a single machine), and then later helped with the transition to using neural machine translation models.
  • LevelDB. The design and implementation of LevelDB, a high performance key-value store that we released as an open-source project. It is used in a wide variety of projects including Google Chrome.

  • Code search. Some internal tools to make it easy to rapidly search our internal source code repository. Many of the ideas from this internal tool were incorporated into our Google Code Search product, including the ability to use regular expressions for searching large corpora of source code.
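
To give a flavor of the MapReduce programming model mentioned in the list above, here is a minimal single-process sketch of the classic word-count example. It only illustrates the shape of the model (a map function emitting key/value pairs, a grouping step, and a reduce function per key); the function names are hypothetical, and nothing here reflects the distributed execution, fault tolerance, or actual API of Google's implementation.

```python
from collections import defaultdict


def map_word_count(doc_id, text):
    """Map step: emit (word, 1) for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_word_count(word, counts):
    """Reduce step: sum all the counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Toy in-memory driver: map every input, group by key, then reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    return dict(reduce_fn(k, vs) for k, vs in sorted(groups.items()))


# Illustrative input documents.
docs = [("d1", "the cat sat"), ("d2", "the dog sat")]
word_counts = run_mapreduce(docs, map_word_count, reduce_word_count)
```

The user supplies only the two small functions; the driver owns the grouping, which is the part a real system distributes across many machines.
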
I enjoy developing software with great colleagues, and I've been fortunate to have worked with many wonderful and talented people on all of my work here at Google. To help ensure that Google continues to hire people with excellent technical skills, I've also been fairly involved in our engineering hiring process.

I received a Ph.D. in computer science from the University of Washington in 1996, working on compiler optimizations for object-oriented languages advised by Craig Chambers. I received a B.S. in computer science and economics (summa cum laude) from the University of Minnesota in 1990 (doing honors theses on parallel training of neural networks and the economic impact of HIV/AIDS).

From 1996 to 1999, I worked for Digital Equipment Corporation's Western Research Lab in Palo Alto, where I worked on low-overhead profiling tools, design of profiling hardware for out-of-order microprocessors, and web-based information retrieval. From 1990 to 1991, I worked for the World Health Organization's Global Programme on AIDS, developing software to do statistical modeling, forecasting, and analysis of the HIV pandemic. In high school and during the summers in college, I worked first at the Centers for Disease Control and later at the World Health Organization developing a series of versions of software called Epi Info (wikipedia) for analyzing epidemiological data (still one of my most cited works).

In 2009, I was elected to the National Academy of Engineering, and in 2016, I was elected as a member of the American Academy of Arts and Sciences. I was also named a Fellow of the Association for Computing Machinery (ACM) and a Fellow of the American Association for the Advancement of Science (AAAS). I am a recipient of the ACM Prize in Computing (2012, with my long-time colleague Sanjay Ghemawat), the IEEE John von Neumann medal (video), and the Mark Weiser Award.

James Somers of the New Yorker wrote a delightful article in 2018 about me and my long-time collaborator Sanjay Ghemawat and how we work together: The Friendship That Made Google Huge.

Selected slides/talks:

Note that talks with similar titles sometimes end up having different mixes of content.

Some of the papers I’ve co-authored with awesome colleagues have been fortunate enough to win various awards:
  • NeurIPS 2023 Test of Time Award (for Distributed Representations of Words and Phrases and their Compositionality published at NeurIPS 2013)
  • Outstanding Paper Award, MLSys 2022 (for Pathways: Asynchronous Distributed Dataflow for ML)
  • SIGOPS Hall of Fame Award, 2022 (for Spanner: Google’s Globally Distributed Database System at OSDI 2012)
  • Best Paper Award, EuroSys 2018 (for Dynamic Control Flow in Large-Scale Machine Learning)
  • SIGOPS Hall of Fame Award, 2016 (for Bigtable: A Distributed Storage System for Structured Data)
  • SIGOPS Hall of Fame Award, 2015 (for MapReduce: Simplified Data Processing on Large Clusters)
  • Best Paper Award, OSDI 2012 (for Spanner: Google’s Globally Distributed Database System)
  • 10-year Retrospective Most Influential Paper Award from OOPSLA 2007 (for Call Graph Construction in Object-Oriented Languages, 1997).
  • Best Paper Award, OSDI 2006 (for Bigtable: A Distributed Storage System for Structured Data)
  • 10-year Retrospective Most Influential Paper Award from PLDI 2005 (for Selective Specialization for Object-Oriented Languages, 1995)
  • Best Paper Award, SOSP 1997 (for Continuous Profiling: Where Have All the Cycles Gone?)

Personal:

I've lived in lots of places in my life: Honolulu, HI; Manila, the Philippines; Boston, MA; West Nile District, Uganda; Boston (again); Little Rock, AR; Hawaii (again); Minneapolis, MN; Mogadishu, Somalia; Atlanta, GA; Minneapolis (again); Geneva, Switzerland; Seattle, WA; and (currently) Palo Alto, CA. I'm hard-pressed to pick a favorite, though: each place has its plusses and minuses.

One of my life goals is to play soccer and basketball on every continent. So far, I've done so in North America, South America, Europe, Asia, Oceania, and Africa. I'm worried that Antarctica might be tough, though.

Authored Publications
    Pathways: Asynchronous Distributed Dataflow for ML
    MLSys 2022
    We present the design of a new large scale orchestration layer for accelerators. Our system, Pathways, is explicitly designed to enable exploration of new systems and ML research ideas, while retaining state of the art performance for current models. Pathways uses a sharded dataflow graph of asynchronous operators that consume and produce futures, and efficiently gang-schedules heterogeneous parallel computations on thousands of accelerators while coordinating data transfers over their dedicated interconnects. Pathways makes use of a novel asynchronous distributed dataflow design that lets the control plane execute in parallel despite dependencies in the data plane. This design, with careful engineering, allows Pathways to adopt a single-controller model that makes it easier to express complex new parallelism patterns. We demonstrate that Pathways can achieve performance parity (~100% accelerator utilization) with state-of-the-art systems when running SPMD computations over 2048 TPUs, while also delivering throughput comparable to the SPMD case for Transformer models that are pipelined across 16 stages, or sharded across two islands of accelerators connected over a data center network.
    Dynamic Control Flow in Large-Scale Machine Learning
    Yuan Yu
    Eugene Brevdo
    Mike Burrows
    Tim Harley
    Peter Hawkins
    Manjunath Kudlur
    Rajat Monga
    Xiaoqiang Zheng
    Proceedings of EuroSys 2018
    Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditional execution, and other features that call for dynamic control flow. These applications benefit from the ability to make rapid control-flow decisions across a set of computing devices in a distributed system. For performance, scalability, and expressiveness, a machine learning system must support dynamic control flow in distributed and heterogeneous environments. This paper presents a programming model for distributed machine learning that supports dynamic control flow. We describe the design of the programming model, and its implementation in TensorFlow, a distributed machine learning system. Our approach extends the use of dataflow graphs to represent machine learning models, offering several distinctive features. First, the branches of conditionals and bodies of loops can be partitioned across many machines to run on a set of heterogeneous devices, including CPUs, GPUs, and custom ASICs. Second, programs written in our model support automatic differentiation and distributed gradient computations, which are necessary for training machine learning models that use control flow. Third, our choice of non-strict semantics enables multiple loop iterations to execute in parallel across machines, and to overlap compute and I/O operations. We have done our work in the context of TensorFlow, and it has been used extensively in research and production. We evaluate it using several real-world applications, and demonstrate its performance and scalability.
    TensorFlow: A system for large-scale machine learning
    Jianmin Chen
    Matthieu Devin
    Geoffrey Irving
    Manjunath Kudlur
    Rajat Monga
    Benoit Steiner
    Paul Tucker
    Vijay Vasudevan
    Pete Warden
    Yuan Yu
    Xiaoqiang Zheng
    12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), USENIX Association (2016), pp. 265-283
    TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous “parameter server” designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.
    TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
    Ashish Agarwal
    Eugene Brevdo
    Craig Citro
    Matthieu Devin
    Ian Goodfellow
    Andrew Harp
    Geoffrey Irving
    Yangqing Jia
    Rafal Jozefowicz
    Lukasz Kaiser
    Manjunath Kudlur
    Dan Mané
    Rajat Monga
    Chris Olah
    Mike Schuster
    Jonathon Shlens
    Benoit Steiner
    Ilya Sutskever
    Kunal Talwar
    Paul Tucker
    Vijay Vasudevan
    Pete Warden
    Yuan Yu
    Xiaoqiang Zheng
    tensorflow.org (2015)
    TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.
    The Tail at Scale
    Luiz André Barroso
    Communications of the ACM, 56 (2013), pp. 74-80
    Systems that respond to user actions very quickly (within 100 milliseconds) feel more fluid and natural to users than those that take longer [Card et al 1991]. Improvements in Internet connectivity and the rise of warehouse-scale computing systems [Barroso & Hoelzle 2009] have enabled Web services that provide fluid responsiveness while consulting multi-terabyte datasets that span thousands of servers. For example, the Google search system now updates query results interactively as the user types, predicting the most likely query based on the prefix typed so far, performing the search, and showing the results within a few tens of milliseconds. Emerging augmented reality devices such as the Google Glass prototype will need associated Web services with even greater computational needs while guaranteeing seamless interactivity. It is challenging to keep the tail of the latency distribution low for interactive services as the size and complexity of the system scales up or as overall utilization increases. Temporary high latency episodes which are unimportant in moderate size systems may come to dominate overall service performance at large scale. Just as fault-tolerant computing aims to create a reliable whole out of less reliable parts, we suggest that large online services need to create a predictably responsive whole out of less predictable parts. We refer to such systems as latency tail-tolerant, or tail-tolerant for brevity. This article outlines some of the common causes of high latency episodes in large online services and describes techniques that reduce their severity or mitigate their impact in whole system performance. In many cases, tail-tolerant techniques can take advantage of resources already deployed to achieve fault-tolerance, resulting in low additional overheads. We show that these techniques allow system utilization to be driven higher without lengthening the latency tail, avoiding wasteful over-provisioning.
    Spanner: Google's Globally Distributed Database
    Michael Epstein
    Andrew Fikes
    Christopher Frost
    J. J. Furman
    Andrey Gubarev
    Christopher Heiser
    Sebastian Kanthak
    Eugene Kogan
    Hongyi Li
    Sergey Melnik
    David Mwaura
    David Nagle
    Rajesh Rao
    Lindsay Rolig
    Yasushi Saito
    Michal Szymaniak
    Christopher Taylor
    Ruth Wang
    Dale Woodford
    ACM Trans. Comput. Syst., 31 (2013), pp. 8
    Achieving Rapid Response Times in Large Online Services
    Talk given at Berkeley AMPLab Cloud Seminar, March 26, 2012 (2012)
    Today’s large-scale web services provide rapid responses to interactive requests by applying large amounts of computational resources to massive datasets. They typically operate in warehouse-sized datacenters and run on clusters of machines that are shared across many kinds of interactive and batch jobs. As these systems distribute work to ever larger numbers of machines and sub-systems in order to provide interactive response times, it becomes increasingly difficult to tightly control latency variability across these machines, and often the 95%ile and 99%ile response times suffer in an effort to improve average response times. As systems scale up, simply stamping out all sources of variability does not work. Just as fault-tolerant techniques needed to be developed when guaranteeing fault-free operation by design became unfeasible, techniques that deliver predictably low service-level latency in the presence of highly-variable individual components are increasingly important at larger scales. In this talk, I’ll describe a collection of techniques and practices lowering response times in large distributed systems whose components run on shared clusters of machines, where pieces of these systems are subject to interference by other tasks, and where unpredictable latency hiccups are the norm, not the exception. Some of the techniques adapt to trends observed over periods of a few minutes, making them effective at dealing with longer-lived interference or resource contention. Others react to latency anomalies within a few milliseconds, making them suitable for mitigating variability within the context of a single interactive request. I’ll discuss examples of how these techniques are used in various pieces of Google’s systems infrastructure and in various higher-level online services.
    Spanner: Google's Globally-Distributed Database
    Michael Epstein
    Andrew Fikes
    Christopher Frost
    JJ Furman
    Andrey Gubarev
    Christopher Heiser
    Peter Hochschild
    Sebastian Kanthak
    Eugene Kogan
    Hongyi Li
    Sergey Melnik
    David Mwaura
    David Nagle
    Rajesh Rao
    Lindsay Rolig
    Dale Woodford
    Yasushi Saito
    Christopher Taylor
    Michal Szymaniak
    Ruth Wang
    OSDI (2012)
    Spanner is Google's scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This paper describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty. This API and its implementation are critical to supporting external consistency and a variety of powerful features: non-blocking reads in the past, lock-free read-only transactions, and atomic schema changes, across all of Spanner.
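
The Spanner abstract above mentions a time API that exposes clock uncertainty. As a rough illustration of why exposing uncertainty helps with external consistency, the sketch below models a clock that returns an interval rather than a single instant, and a "commit wait" that delays until the chosen commit timestamp is guaranteed to be in the past. The class and function names, the simulated clock, and the millisecond values are assumptions made for this example; this is a toy model, not Spanner's implementation or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeInterval:
    """A clock reading with explicit bounds (integer milliseconds)."""
    earliest: int
    latest: int


class SimulatedUncertainClock:
    """Toy clock whose readings are uncertain by +/- epsilon_ms (hypothetical)."""

    def __init__(self, epsilon_ms, start_ms=0):
        self.epsilon_ms = epsilon_ms
        self._true_ms = start_ms

    def advance(self, delta_ms):
        """Move simulated real time forward."""
        self._true_ms += delta_ms

    def now(self):
        """Return an interval guaranteed to contain the true time."""
        return TimeInterval(self._true_ms - self.epsilon_ms,
                            self._true_ms + self.epsilon_ms)


def commit_with_wait(clock, tick_ms=1):
    """Pick a commit timestamp, then wait until it has definitely passed."""
    commit_ts = clock.now().latest
    waited_ms = 0
    # Wait until even the earliest possible current time exceeds commit_ts.
    while clock.now().earliest <= commit_ts:
        clock.advance(tick_ms)
        waited_ms += tick_ms
    return commit_ts, waited_ms


clock = SimulatedUncertainClock(epsilon_ms=4)
commit_ts, waited_ms = commit_with_wait(clock)
```

In this model the wait is a little over twice the uncertainty bound, which is why keeping clock uncertainty small matters for commit latency in a design of this kind.
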