Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 10132 publications
    KATch: A Fast Symbolic Verifier for NetKAT
    Mark Moeller
    Jules Jacobs
    Olivier Savary Belanger
    David Darais
    Cole Schlesinger
    Nate Foster
    Alexandra Silva
    Programming Languages and Implementation (PLDI) (2024) (to appear)
    Abstract: We develop new data structures and algorithms for checking verification queries in NetKAT, a domain-specific language for specifying the behavior of network data planes. Our results extend the techniques obtained in prior work on symbolic automata and provide a framework for building efficient and scalable verification tools. We present KATch, an implementation of these ideas in Scala, including extended logical operators that are useful for expressing network-wide specifications and optimizations that construct a bisimulation quickly or generate a counter-example showing that none exists. We evaluate the performance of our implementation on real-world and synthetic benchmarks, verifying properties such as reachability and slice isolation, typically returning a result in well under a second, which is orders of magnitude faster than previous approaches.
    Validation of a deep learning system for the detection of diabetic retinopathy in Indigenous Australians
    Mark Chia
    Fred Hersch
    Pearse Keane
    Angus Turner
    British Journal of Ophthalmology, 108 (2024), pp. 268-273
    Abstract: Background/aims: Deep learning systems (DLSs) for diabetic retinopathy (DR) detection show promising results but can underperform in racial and ethnic minority groups; therefore, external validation within these populations is critical for health equity. This study evaluates the performance of a DLS for DR detection among Indigenous Australians, an understudied ethnic group who suffer disproportionately from DR-related blindness. Methods: We performed a retrospective external validation study comparing the performance of a DLS against a retinal specialist for the detection of more-than-mild DR (mtmDR), vision-threatening DR (vtDR) and all-cause referable DR. The validation set consisted of 1682 consecutive, single-field, macula-centred retinal photographs from 864 patients with diabetes (mean age 54.9 years, 52.4% women) at an Indigenous primary care service in Perth, Australia. Three-person adjudication by a panel of specialists served as the reference standard. Results: For mtmDR detection, sensitivity of the DLS was superior to the retina specialist (98.0% (95% CI, 96.5 to 99.4) vs 87.1% (95% CI, 83.6 to 90.6), McNemar’s test p<0.001) with a small reduction in specificity (95.1% (95% CI, 93.6 to 96.4) vs 97.0% (95% CI, 95.9 to 98.0), p=0.006). For vtDR, the DLS’s sensitivity was again superior to the human grader (96.2% (95% CI, 93.4 to 98.6) vs 84.4% (95% CI, 79.7 to 89.2), p<0.001) with a slight drop in specificity (95.8% (95% CI, 94.6 to 96.9) vs 97.8% (95% CI, 96.9 to 98.6), p=0.002). For all-cause referable DR, there was a substantial increase in sensitivity (93.7% (95% CI, 91.8 to 95.5) vs 74.4% (95% CI, 71.1 to 77.5), p<0.001) and a smaller reduction in specificity (91.7% (95% CI, 90.0 to 93.3) vs 96.3% (95% CI, 95.2 to 97.4), p<0.001). Conclusion: The DLS showed improved sensitivity and similar specificity compared with a retina specialist for DR detection. This demonstrates its potential to support DR screening among Indigenous Australians, an underserved population with a high burden of diabetic eye disease.
    Creative ML Assemblages: The Interactive Politics of People, Processes, and Products
    Ramya Malur Srinivasan
    Katharina Burgdorf
    Jennifer Lena
    ACM Conference on Computer Supported Cooperative Work and Social Computing (2024) (to appear)
    Abstract: Creative ML tools are collaborative systems that afford artistic creativity through their myriad interactive relationships. We propose using "assemblage thinking" to support analyses of creative ML by approaching it as a system in which the elements of people, organizations, culture, practices, and technology constantly influence each other. We model these interactions as "coordinating elements" that give rise to the social and political characteristics of a particular creative ML context, and call attention to three dynamic elements of creative ML whose interactions provide unique context for the social impact of a particular system: people, creative processes, and products. As creative assemblages are highly contextual, we present these as analytical concepts that computing researchers can adapt to better understand the functioning of a particular system or phenomenon and identify intervention points to foster desired change. This paper contributes to theorizing interactions with AI in the context of art, and how these interactions shape the production of algorithmic art.
    Abstract: Recently proposed long-form question answering (QA) systems, supported by large language models (LLMs), have shown promising capabilities. Yet, attributing and verifying their generated abstractive answers can be difficult, and automatically evaluating their accuracy remains an ongoing challenge. In this paper, we introduce a new QA task for answering multi-answer questions by summarizing multiple diverse sources in a semi-extractive fashion. Specifically, Semi-extractive Multi-source QA (SEMQA) requires models to output a comprehensive answer while mixing between factual quoted spans (copied verbatim from given input sources) and non-factual free-text connectors that glue these spans together into a single cohesive passage. This setting bridges the gap between the outputs of well-grounded but constrained extractive QA systems and more fluent but harder-to-attribute fully abstractive answers. In particular, it enables a new mode for language models that leverages their advanced language generation capabilities, while also producing fine in-line attributions by design that are easy to verify, interpret, and evaluate. To study this task, we create the first dataset of this kind with human-written semi-extractive answers to natural and generated questions, and define text-based evaluation metrics. Experimenting with several LLMs in various settings, we find this task to be surprisingly challenging, demonstrating the importance of our work for developing and studying such consolidation capabilities.
    Abstract: Millions of people turn to Google Search each day for information on things as diverse as new cars or flu symptoms. The terms that they enter contain valuable information on their daily intent and activities, but the information in these search terms has been difficult to fully leverage. User-defined categorical filters have been the most common way to shrink the dimensionality of search data to a tractable size for analysis and modeling. In this paper we present a new approach to reducing the dimensionality of search data while retaining much of the information in the individual terms without user-defined rules. Our contributions are two-fold: 1) we introduce SLaM Compression, a way to quantify search terms using pre-trained language models and create a representation of search data that has low dimensionality, is memory efficient, and effectively acts as a summary of search, and 2) we present CoSMo, a Constrained Search Model for estimating real world events using only search data. We demonstrate the efficacy of our contributions by estimating with high accuracy U.S. automobile sales and U.S. flu rates using only Google Search data.
    Abstract: Algorithms for the computation of alternative routes in road networks power many geographic navigation systems. A good set of alternative routes offers meaningful options to the user of the system and can support applications such as routing that is robust to failures (e.g., road closures or extreme traffic congestion) and routing with diverse preferences and objective functions. Algorithmic techniques for alternative route computation include the penalty method, via-node type algorithms (which deploy bidirectional search and finding plateaus), and, more recently, electrical-circuit based algorithms. In this work we focus on the practically important family of via-node type algorithms and we aim to produce high quality alternative routes for road networks. We study alternative route computation in the presence of a fast routing infrastructure that relies on hierarchical routing (namely, CRP). We propose new approaches that rely on deep learning methods. Our training methodology utilizes the hierarchical partition of the graph and builds models to predict which boundary road segments in the partition should be crossed by the alternative routes. We describe our methods in detail and evaluate them against the previously studied architectures, as well as against a stronger baseline that we define in this work, showing improvements in quality in the road networks of Seattle, Paris, and Bangalore.
    Abstract: The article delves into the promise of AI in business intelligence. It briefly reviews the evolution of BI and various Cloud tools, followed by the paradigm shift in how data is consumed. While AI brings huge potential, the article covers areas where enterprises must exercise caution when building intelligent agents to answer data questions.
    Abstract: Facilitated by large language models (LLMs), personalized text generation has become a rapidly growing research direction. Most existing studies focus on designing specialized models for a particular domain, or they require fine-tuning the LLMs to generate personalized text. We consider a typical scenario in which the large language model, which generates personalized output, is frozen and can only be accessed through APIs. Under this constraint, all one can do is to improve the input text (i.e., text prompts) sent to the LLM, a procedure that is usually done manually. In this paper, we propose a novel method to automatically revise prompts for personalized text generation. The proposed method takes the initial prompts generated by a state-of-the-art, multistage framework for personalized generation and rewrites a few critical components that summarize and synthesize the personal context. The prompt rewriter employs a training paradigm that chains together supervised learning (SL) and reinforcement learning (RL), where SL reduces the search space of RL and RL facilitates end-to-end training of the rewriter. Using datasets from three representative domains, we demonstrate that the rewritten prompts outperform both the original prompts and the prompts optimized via supervised learning or reinforcement learning alone. In-depth analysis of the rewritten prompts shows that they are not only human readable, but also able to guide manual revision of prompts when there are limited resources to employ reinforcement learning to train the prompt rewriter, or when it is costly to deploy an automatic prompt rewriter for inference.
    Abstract: The evolution of AI is a pivotal moment in history, but it’s not the first time we have experienced technological advances that have changed how humans work. By looking at the advances in automobiles, we are reminded of the importance of focusing on our developers' needs and goals.
    The Case for Validating Inputs in Software-Defined WANs
    Rishabh Iyer
    Isaac Keslassy
    Sylvia Ratnasamy
    The 23rd ACM Workshop on Hot Topics in Networks (HOTNETS ’24), ACM, Irvine, CA (2024) (to appear)
    Abstract: We highlight a problem that the networking community has largely overlooked: ensuring that the inputs to network controllers in software-defined WANs are accurate. We show that “incorrect” inputs are a common cause of major outages in practice and propose new directions to address these.
    Sleep patterns and risk of chronic disease as measured by long-term monitoring with commercial wearable devices in the All of Us Research Program
    Neil S. Zheng
    Jeffrey Annis
    Hiral Master
    Lide Han
    Karla Gleichauf
    Melody Nasser
    Peyton Coleman
    Stacy Desine
    Douglas M. Ruderfer
    John Hernandez
    Logan D. Schneider
    Evan L. Brittain
    Nature Medicine (2024)
    Abstract: Poor sleep health is associated with increased all-cause mortality and incidence of many chronic conditions. Previous studies have relied on cross-sectional and self-reported survey data or polysomnograms, which have limitations with respect to data granularity, sample size and longitudinal information. Here, using objectively measured, longitudinal sleep data from commercial wearable devices linked to electronic health record data from the All of Us Research Program, we show that sleep patterns, including sleep stages, duration and regularity, are associated with chronic disease incidence. Of the 6,785 participants included in this study, 71% were female, 84% self-identified as white and 71% had a college degree; the median age was 50.2 years (interquartile range = 35.7, 61.5) and the median sleep monitoring period was 4.5 years (2.5, 6.5). We found that rapid eye movement sleep and deep sleep were inversely associated with the odds of incident atrial fibrillation and that increased sleep irregularity was associated with increased odds of incident obesity, hyperlipidemia, hypertension, major depressive disorder and generalized anxiety disorder. Moreover, J-shaped associations were observed between average daily sleep duration and hypertension, major depressive disorder and generalized anxiety disorder. These findings show that sleep stages, duration and regularity are all important factors associated with chronic disease development and may inform evidence-based recommendations on healthy sleeping habits.
    Abstract: Examines how a Google R&D programme sought to accelerate a future of safer, cheaper and more ubiquitous fusion and other nuclear energy. Discusses how the programme was started and its major components: fusion, edge-of-technology, and policy advocacy supporting innovation. Shows successful exits for each part. Beyond telling the story, the intent is to show how to move the needle, to get people to think about how they might also help, and to show that Google has made a difference. Timing of publication marks the 10th anniversary of the programme's start.
    Drug Design on Quantum Computers
    Raffaele Santagati
    Alán Aspuru-Guzik
    Matthias Degroote
    Leticia Gonzalez
    Elica Kyoseva
    Nikolaj Moll
    Markus Oppel
    Robert Parrish
    Michael Streif
    Christofer Tautermann
    Horst Weiss
    Nathan Wiebe
    Clemens Utschig-Utschig
    Nature Physics (2024)
    Abstract: The promised industrial applications of quantum computers often rest on their anticipated ability to perform accurate, efficient quantum chemical calculations. Computational drug discovery relies on accurate predictions of how candidate drugs interact with their targets in a cellular environment involving several thousands of atoms at finite temperatures. Although quantum computers are still far from being used as daily tools in the pharmaceutical industry, here we explore the challenges and opportunities of applying quantum computers to drug design. We discuss where these could transform industrial research and identify the substantial further developments needed to reach this goal.
    Abstract: This paper reflects on work at Google over the past decade to address common types of software safety and security defects. Our experience has shown that software safety is an emergent property of the software and tooling ecosystem it is developed in and the production environment into which it is deployed. Thus, to effectively prevent common weaknesses at scale, we need to shift-left the responsibility for ensuring safety and security invariants to the end-to-end developer ecosystem, that is, programming languages, software libraries, application frameworks, build and deployment tooling, the production platform and its configuration surfaces, and so forth. Doing so is practical and cost effective when developer ecosystems are designed with application archetypes in mind, such as web or mobile apps: The design of the developer ecosystem can address threat model aspects that apply commonly to all applications of the respective archetype, and investments to ensure safety invariants at the ecosystem level amortize across many applications. Applying secure-by-design principles to developer ecosystems at Google has achieved drastic reduction and in some cases near-zero residual rates of common classes of defects, across hundreds of applications being developed by thousands of developers.
    Efficient data generation for source-grounded information-seeking dialogs: A use case for meeting transcripts
    Lotem Golany
    Maya Mamo
    Nimrod Parasol
    Omer Vandsburger
    Nadav Bar
    Ido Dagan
    Findings of the Association for Computational Linguistics: EMNLP 2024, Association for Computational Linguistics, Miami, Florida, USA, pp. 1908-1925
    Abstract: Automating data generation with Large Language Models (LLMs) has become increasingly popular. In this work, we investigate the feasibility and effectiveness of LLM-based data generation in the challenging setting of source-grounded information-seeking dialogs, with response attribution, over long documents. Our source texts consist of long and noisy meeting transcripts, adding to the task complexity. Since automating attribution remains difficult, we propose a semi-automatic approach: dialog queries and responses are generated with LLMs, followed by human verification and identification of attribution spans. Using this approach, we created MISeD (Meeting Information Seeking Dialogs), a dataset of information-seeking dialogs focused on meeting transcripts. Models finetuned with MISeD demonstrate superior performance compared to off-the-shelf models, even those of larger size. Finetuning on MISeD gives comparable response generation quality to finetuning on fully manual data, while improving attribution quality and reducing time and effort.