In 2011, I co-founded the Google Brain project/team, focused on making progress towards intelligent machines. Since then, my individual work has focused on research, systems and applications for AI and ML, as well as steering the direction of our broader AI/ML and computer science research community. For the past few years, I’ve had the great pleasure to write a blog post early each year summarizing many pieces of the public work done by amazing colleagues and researchers over the previous year in our research teams (despite the similar-sounding titles, these annual blog posts are each quite different!).
- Jan 2023: Google Research, 2022 & beyond: Language, vision and generative models (part 1 of a 9-part series)
- Jan 2022: Google Research: Themes from 2021 and Beyond
- Jan 2021: Google Research: Looking Back at 2020, and Forward to 2021
- Jan 2020: Google Research: Looking Back at 2019, and Forward to 2020 and Beyond
- Jan 2019: Looking Back at Google’s Research Efforts in 2018
- Jan 2018: The Google Brain Team — Looking Back on 2017 (Part 1 of 2) … Part 2
- Jan 2017: The Google Brain Team — Looking Back on 2016
- Research leadership. Steering the research directions of the Google Brain team, Google Research, and now Google DeepMind (with many others!). See year-end blog post links above for more details about this, which includes advances in things like the Transformer architecture, machine learning systems (DistBelief, TensorFlow, Pathways), TPUs, the Inception model, word2vec, seq2seq models, neural machine translation, distillation, neural architecture search/AutoML, RankBrain, BERT, TensorFlow, JAX, Pathways, PaLM, PaLM 2, PaLI, PaLM-E, MedPalm, NeRF, quantum computing advances, ML for chip design, computational photography (e.g. Night Sight & Magic Eraser), flood forecasting, Responsible AI research areas like bias, fairness and interpretability, medical diagnostics, auction theory, open source software and datasets, accessibility, weather forecasting, ML for robotics, connectomics, genomics, and more, as well as research impact in products across nearly all of Google, including Search, Ads, YouTube, GMail, Workspace, Maps, News, Photos, Translate, Android, Cloud, Pixel, Waymo, and many more products.
- Computer systems for ML. The design and implementation of three generations of systems for training and deploying of deep learning models: DistBelief, TensorFlow, and Pathways.
In DistBelief, we explored large-scale, highly distributed systems and asynchronous training algorithms to enable ML models to be trained on large amounts of data, even on the relatively slow, non-ML-optimized hardware of the time (we trained models with 2B non-embedding parameters at a time when the largest models reported in the literature were 10M to 50M parameters). The system was used for hundreds of projects within Google and had widespread use across many Google products. Some of the earliest research work we did using DistBelief was exploring unsupervised learning on video frames to see what sorts of representations would emerge, in Building high-level features using large scale unsupervised learning, a.k.a "the cat neuron paper". We also used DistBelief to develop word2vec, various speech recognition models, multimodal work like DeViSE, and early embedding models like RankBrain.
TensorFlow: I was one of the primary designers and implementors of the initial TensorFlow system. I made the case that we should open-source Tensorflow, and we released it as an open source project in 2015, hosted on GitHub. It is used by millions of researchers and developers all over the world for exploring and creating ML and AI systems on platforms ranging from tiny embedded systems, to phones, desktop computers, and ML supercomputers. For detailed papers on TensorFlow, see Tensorflow: Large-scale machine learning on heterogeneous distributed systems (white paper) and TensorFlow: A System for Large-Scale Machine Learning (OSDI 2016).
Pathways is designed to support large-scale, multimodal, sparse architectures that are capable of solving thousands or millions of tasks. I was one of the original designers and implementers, and a paper about the systems research aspects of Pathways appeared in MLSys 2022 as Pathways: Asynchronous Distributed Dataflow for ML. The underlying system software has been used for work like the PaLM language models (which underlie work like Med-PaLM, PaLM-E for robotics, PaLI, and other downstream uses).
- Language modeling. I have worked on many different projects related to language modeling, starting with work in 2007 that trained 300 billion parameter language models on trillions of tokens of text (Large language models in machine translation), demonstrating significant improvements in translation quality.
I was a co-author on a pair of papers that introduced an approach of learning distributed representations of words that is now commonly called word2vec (Efficient estimation of word representations in vector space and Distributed representations of words and phrases and their compositionality).
I was one of many who helped to convert the Google Translate system over to using a neural machine translation system, with further significant gains to translation quality. See Google’s neural machine translation system: Bridging the gap between human and machine translation (2016) and Google’s multilingual neural machine translation system: Enabling zero-shot translation. Gideon Lewis-Kraus of The NY Times magazine wrote an in-depth feature about the rollout of the neural machine translation system in Google Translate in The Great AI Awakening.
Part of the infrastructure work on Pathways is designed to enable scaling training of larger models on larger and more diverse datasets. I worked on the PaLM language model work, and I am one of the co-leads of the Gemini effort, which is building next-generation multimodal models that can use tools and APIs to enable more capable models that can be used in a variety of Google products and application areas.
- Distillation. I am one of the co-creators of a machine learning technique called distillation, a now-widely-used approach for transferring the knowledge from one neural network to another. It is often used to create smaller, much more efficient models for inference from larger, more unwieldy models, and it can also be used to transfer knowledge from one neural network architecture to a completely different architecture. See Distilling the Knowledge in a Neural Network.
- Sparse models. I have been involved in a series of work on sparse model architectures for neural networks, including Outrageously large neural networks: The sparsely-gated mixture-of-experts layer (2017) and Designing Effective Sparse Expert Models. A review of approaches for sparse models appears in A Review of Sparse Expert Models in Deep Learning.
- AI for ASIC chip design. I have worked on research on how to apply reinforcement learning to the problem of placement and routing in ASIC chip design. We have shown that it is possible to get performance that is as good or better than human performance on the problem of chip floorplanning in a system that runs in a few hours. Our work here was published in Nature and has been used for multiple generations of Google’s TPU ML accelerators.
- ML for healthcare. I have worked on the use of AI and machine learning in healthcare settings. We have done work showing that machine learning on deidentified medical records can produce useful and actionable suggestions for clinicians, published as Scalable and Accurate Deep Learning with Electronic Health Records. The broader research community at Google has also done work on applying machine learning across many different problems in health, including medical imaging diagnostics, genomics, medical note transcription and summarization, and novel sensing (see health sections of year-in-review blog posts above). I’ve also collaborated on a couple of review articles in this space. One assessed some of the most promising directions for integrating deep learning into healthcare settings, and was published in Nature Medicine as A Guide to Deep Learning in Healthcare. The other was a NEJM article titled Machine Learning in Medicine.
- ML for computer systems. I have worked with many others on advancing the use of machine learning for tackling computer systems problems. Among these are device placement using reinforcement learning to map abstract ML computation graphs onto a set of physical devices in order to give the best performance (and some follow-on work on a hierarchical version of this), and the use of learned index structures in database systems instead of traditional data structures like B-trees and hash tables.
- Energy efficiency of machine learning. I have helped push forward Google’s TPU efforts, identifying fairly early in the widespread use of deep learning that creating efficient systems was going to require building customized accelerator hardware, leading to a long line of TPU processors. TPUv1 (In-datacenter Performance Analysis of a Tensor Processing Unit) targeted inference computations and was about 30X - 80X better performance/Watt than contemporary CPUs and GPUs. Subsequent TPU generations target both training and inference in large-scale ML accelerator systems and are crucial to much of the machine learning research and product applications of ML at Google. They are available to external entities as Google Cloud TPUs.
Carbon emissions of machine learning training is an area that is rife with misinformation due to the prevalence of flawed and inaccurate estimates, so I have also worked with others to correct some of this misinformation and put actual measured data into the literature. See Carbon emissions and large neural network training, especially appendices C and D, and The carbon footprint of machine learning training will plateau, then shrink (if ML researchers adopt best practices). I gave a talk on some of these issues at the 2022 MIT Climate Impacts of Computing and Communications workshop.
- Google Search. The design and implementation of five generations of our crawling, indexing, and query serving systems, covering two and three orders of magnitude growth in number of documents searched, number of queries handled per second, and frequency of updates to the system. We did not publish research papers on most aspects of this, but I gave a talk at WSDM'09 about some of the issues involved in building large-scale retrieval systems (slides).
- Search ranking algorithms. Some aspects of our search ranking algorithms, notably improved handling for dealing with off-page signals such as anchortext.
- Search ranking prototyping system. The design and implementation of prototyping infrastructure for rapid development and experimentation with new ranking algorithms.
- MapReduce. The design and implementation of MapReduce, a system for simplifying the development of large-scale data processing applications. A paper about MapReduce appeared in OSDI'04. MapReduce is used extensively within Google, and provided the inspiration for external open-source projects like Hadoop, as well as follow-on projects like Flume.
- BigTable. The design and implementation of BigTable, a large-scale semi-structured storage system used underneath a number of Google products. A paper about BigTable appeared in OSDI'06. BigTable is used by hundreds of teams at Google and sits underneath dozens of products. It is available externally as Cloud Bigtable.
- Spanner. The design and implementation of Spanner, a geographically-distributed worldwide storage system that can provide strong consistency guarantees through the use of Paxos and highly synchronized clocks in multiple data centers. A paper about Spanner appeared in OSDI’12. Spanner is used extensively for hundreds of projects within Google, underlies a large fraction of our products, and is available for external uses as Google’s Cloud Spanner product.
- Google Ads. I was part of a group of three people who did the design and implementation of the initial version of Google's advertising serving system.
- AdSense. The initial development of Google's AdSense for Content product (involving both the production serving system design and implementation as well as work on developing and improving the quality of ad selection based on the contents of pages).
- Protocol buffers. The development of Protocol Buffers, a way of encoding structured data in an efficient yet extensible format, and a compiler that generates convenient wrappers for manipulating the objects in a variety of languages. Protocol Buffers are used extensively at Google for almost all RPC protocols, and for storing structured information in a variety of persistent storage systems. A version of the protocol buffer implementation has been open-sourced and is available at https://github.com/protocolbuffers/protobuf/, and a developer site with documentation and more details is at https://protobuf.dev/.
- Google News. Some of the initial production serving system work for the Google News product, working with Krishna Bharat to move the prototype system he put together into a deployed system.
- Job scheduling system. The design and implementation of the first generation of our automated job scheduling system for managing a cluster of machines.
- Timeseries analysis system. The initial design and implementation of a system for analyzing complex timeseries data. This system is used extensively by dozens of Google teams to support various use cases like suggested completions, recommendations, etc. The system is available for Cloud customers to analyze their own datasets via the Timeseries Insights API.
- Google Translate. Some of the production system design for Google Translate, our statistical machine translation system. In particular, I designed and implemented a system for distributed high-speed access to very large language models (too large to fit in memory on a single machine), and then later helped with the transition to using neural machine translation models.
- LevelDB. The design and implementation of LevelDB, a high performance key-value store that we released as an open-source project. It is used in a wide variety of projects including Google Chrome.
- Code search. Some internal tools to make it easy to rapidly search our internal source code repository. Many of the ideas from this internal tool were incorporated into our Google Code Search product, including the ability to use regular expressions for searching large corpora of source code.
I received a Ph.D. in computer science from the University of Washington in 1996, working on compiler optimizations for object-oriented languages advised by Craig Chambers. I received a B.S. in computer science and economics (summa cum laude) from the University of Minnesota in 1990 (doing honors theses on parallel training of neural networks and the economic impact of HIV/AIDS).
From 1996 to 1999, I worked for Digital Equipment Corporation's Western Research Lab in Palo Alto, where I worked on low-overhead profiling tools, design of profiling hardware for out-of-order microprocessors, and web-based information retrieval. From 1990 to 1991, I worked for the World Health Organization's Global Programme on AIDS, developing software to do statistical modeling, forecasting, and analysis of the HIV pandemic. In high school and during the summers in college, I worked first at the Centers for Disease Control and later at the World Health Organization developing a series of versions of software called Epi Info (wikipedia) for analyzing epidemiological data (still one of my most cited works).
In 2009, I was elected to the National Academy of Engineering, and in 2016, I was elected as a member of the American Academy of Arts and Sciences. I was also named a Fellow of the Association for Computing Machinery (ACM) and a Fellow of the American Association for the Advancement of Sciences (AAAS). I am a recipient of the ACM Prize in Computing (2012, with my long-time colleague Sanjay Ghemawat), the IEEE John von Neumann medal, and the Mark Weiser Award.
James Somers of the New Yorker wrote a delightful article in 2018 about me and my long-time collaborator Sanjay Ghemawat and how we work together: The Friendship That Made Google Huge.
Selected slides/talks:Note that talks with similar titles sometimes end up having different mixes of content.
- MIT Climate Impacts of Computing and Communications workshop, April 2022: Sustainable Computation and Machine Learning Platforms at Google
- 58th Design Automation Conference keynote, January, 2022: The Potential of Machine Learning for Hardware Design
- TED talk, 2021: AI isn't as smart as you think -- but it could be
- Humans of AI discussion, 2021: S1E17: Jeff Dean with Devi Parikh on Humans of AI: Stories, Not Stats
- Virtual talk at TU @ Berlin, April, 2021: Tackling Grand Challenge Engineering Problems with Deep Learning
- Khipu 2019, November, 2019: Deep Learning to Solve Challenging Problems
- UW Allen School Distinguished Lecture, October, 2019: Deep Learning to Solve Challenging Problems
- Stanford Medicine Big Data | Precision Health conference keynote, 2019: AI in Healthcare
- Berkeley EECS Colloquium, November, 2018: Deep Learning to Solve Challenging Problems
- Heidelberg Laureate Forum, September, 2018: Deep Learning and the Grand Engineering Challenges
- ETH Zurich Lecture, September, 2018: Deep Learning to Solve Challenging Problems
- Heidelberg University talk, September, 2018: Deep Learning to Solve Challenging Problems
- Heidelberg Laureate Forum interview of me by Dr. Tom Crawford, September, 2018: Interview: How I Got Started Programming
- Deep Learning Indaba, September 2018: TensorFlow and Real Life Machine Learning (long: ~2 hrs)
- SysML 2018 invited talk, February, 2018: Systems and Machine Learning Symbiosis
- Talk at YC AI meeting, August, 2017: Building Intelligent Systems with Large Scale Deep Learning (slides)
- AI Frontiers Conference, January 2017: Trends and Developments in Deep Learning Research
- TEDxLA talk, December, 2016: How Will Artificial Intelligence Affect Your Life
- ACM Tech Talks, July, 2016: Large Scale Deep Learning with TensorFlow for Building Intelligent Systems
- First Person, Palo Alto Online, March, 2016: First Person: A Conversation with Jeff Dean
- UW Distinguished Lecture series, February, 2015: Large-Scale Deep Learning For Building Intelligent Computer Systems
- Recommendation Systems (RecSys) keynote, October, 2014: Large Scale Machine Learning for Predictive Tasks (and part 2: there were issues in the recorded live stream so it got split into two)
- Berkeley AMPLab Cloud Seminar talk, March, 2012: Achieving Rapid Response Times in Large Online Services
- Stanford Computer Science Department Distinguished Computer Scientist Lecture lecture, November, 2010: Building Software Systems at Google and Lessons Learned
- Symposium on Cloud Computing (SOCC) keynote, June, 2010: Evolution and Future Directions of Large-scale Storage and Computation Systems at Google
- Web Search and Data Mining Conference (WSDM) keynote, February, 2009: Challenges in Building Large-Scale Information Retrieval Systems
- Google Faculty Summit talk, July, 2008: Some Potential Areas for Future Research
- Google I/O Developers Conference, May, 2008: Underneath the Covers at Google: Current Systems and Future Directions
- Stanford CS295 class lecture, Spring, 2007: Software Engineering Advice from Building Large-Scale Distributed Systems
- UW Colloquium, 2005: BigTable: A Distributed Structured Storage System
- UW Colloquium, 2004: Google: A Behind the Scenes Look
- Outstanding Paper Award, MLSys 2022 (for Pathways: Asynchronous Distributed Dataflow for ML)
- SIGOPS Hall of Fame Award, 2022 (for Spanner: Google’s Globally Distributed Database System at OSDI 2012)
- Best Paper Award, EuroSys 2018 (for Dynamic Control Flow in Large-Scale Machine Learning)
- SIGOPS Hall of Fame Award, 2016 (for Bigtable: A Distributed Storage System for Structured Data)
- SIGOPS Hall of Fame Award, 2015 (for MapReduce: Simplified Data Processing on Large Clusters)
- Best Paper Award, OSDI 2012 (for Spanner: Google’s Globally Distributed Database System)
- 10-year Retrospective Most Influential Paper Award from OOPSLA 2007 (for Call Graph Construction in Object-Oriented Languages, 1997).
- Best Paper Award, OSDI 2006 (for Bigtable: A Distributed Storage System for Structured Data)
- 10-year Retrospective Most Influential Paper Award from PLDI 2005 (for Selective Specialization for Object-Oriented Languages, 1995)
- Best Paper Award, SOSP 1997 (for Continuous Profiling: Where Have All the Cycles Gone?)
Personal:I've lived in lots of places in my life: Honolulu, HI; Manila, The Phillipines; Boston, MA; West Nile District, Uganda; Boston (again); Little Rock, AR; Hawaii (again); Minneapolis, MN; Mogadishu, Somalia; Atlanta, GA; Minneapolis (again); Geneva, Switzerland; Seattle, WA; and (currently) Palo Alto, CA. I'm hard-pressed to pick a favorite, though: each place has its plusses and minuses.
One of my life goals is to play soccer and basketball on every continent. So far, I've done so in North America, South America, Europe, Asia, Oceania, and Africa. I'm worried that Antarctica might be tough, though.