Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

people standing in front of a screen with images and a chipboard

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
1 - 15 of 10128 publications
    Socio-spatial equity analysis of relative wealth index and emergency obstetric care accessibility in urban Nigeria
    Kerry L. M. Wong
    Aduragbemi Banke-Thomas
    Tope Olubodun
    Peter M. Macharia
    Charlotte Stanton
    Narayanan Sundararajan
    Yash Shah
    Mansi Kansal
    Swapnil Vispute
    Olakunmi Ogunyemi
    Uchenna Gwacham-Anisiobi
    Jia Wang
    Ibukun-Oluwa Omolade Abejirinde
    Prestige Tatenda Makanga
    Bosede B. Afolabi
    Lenka Beňová
    Communications Medicine, 4 (2024), pp. 34
    Preview abstract Background Better geographical accessibility to comprehensive emergency obstetric care (CEmOC) facilities can significantly improve pregnancy outcomes. However, with other factors, such as affordability critical for care access, it is important to explore accessibility across groups. We assessed CEmOC geographical accessibility by wealth status in the 15 most-populated Nigerian cities. Methods We mapped city boundaries, verified and geocoded functional CEmOC facilities, and assembled population distribution for women of childbearing age and Meta’s Relative Wealth Index (RWI). We used the Google Maps Platform’s internal Directions Application Programming Interface to obtain driving times to public and private facilities. City-level median travel time (MTT) and number of CEmOC facilities reachable within 60 min were summarised for peak and non-peak hours per wealth quintile. The correlation between RWI and MTT to the nearest public CEmOC was calculated. Results We show that MTT to the nearest public CEmOC facility is lowest in the wealthiest 20% in all cities, with the largest difference in MTT between the wealthiest 20% and least wealthy 20% seen in Onitsha (26 vs 81 min) and the smallest in Warri (20 vs 30 min). Similarly, the average number of public CEmOC facilities reachable within 60 min varies (11 among the wealthiest 20% and six among the least wealthy in Kano). In five cities, zero facilities are reachable under 60 min for the least wealthy 20%. Those who live in the suburbs particularly have poor accessibility to CEmOC facilities. Conclusions Our findings show that the least wealthy mostly have poor accessibility to care. Interventions addressing CEmOC geographical accessibility targeting poor people are needed to address inequities in urban settings. View details
    Conformal Risk Control
    Anastasios N. Angelopoulos
    Stephen Bates
    Adam Fisch
    Lihua Lei
    ICLR (2024)
    Preview abstract We extend conformal prediction to control the expected value of any monotone loss function. The algorithm generalizes split conformal prediction together with its coverage guarantee. Like conformal prediction, the conformal risk control procedure is tight up to an O(1/n) factor. Worked examples from computer vision and natural language processing demonstrate the usage of our algorithm to bound the false negative rate, graph distance, and token-level F1-score. View details
    Knowledge Distillation with Perturbed Loss: From a Vanilla Teacher to a Proxy Teacher
    Rongzhi Zhang
    Chao Zhang
    Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2024), ACM, pp. 4278 - 4289
    Preview abstract Knowledge distillation is a popular technique to transfer knowledge from a large teacher model to a small student model. Typically, the student learns to imitate the teacher by minimizing the KL divergence of its output distribution with the teacher's output distribution. In this work, we argue that such a learning objective is sub-optimal because there exists a discrepancy between the teacher's output distribution and the ground truth label distribution. Therefore, forcing the student to blindly imitate the unreliable teacher output distribution leads to inferior performance. To this end, we propose a novel knowledge distillation objective PTLoss by first representing the vanilla KL-based distillation loss function via a Maclaurin series and then perturbing the leading-order terms in this series. This perturbed loss implicitly transforms the original teacher into a proxy teacher with a distribution closer to the ground truth distribution. We establish the theoretical connection between this "distribution closeness'' and the student model generalizability, which enables us to select the PTLoss's perturbation coefficients in a principled way. Extensive experiments on six public benchmark datasets demonstrate the effectiveness of PTLoss with teachers of different scales. View details
    Guidelines for Productivity in Virtual Reality
    Andrea Colaco
    ACM Interactions, 31 (2024), pp. 46-53
    Preview abstract Most of our interactions with digital content currently occur inside 2D screens, however moving from that format to immersive setups brings a paradigm shift. From content inside the screen to users inside the content. This change requires a revisit to how we blend the analog and the digital and how we transfer content between the two modes. Perhaps it even asks for new guidelines too. While different solutions appear in the space, the dynamic range only seems to widen. We can start to see what works and what does not work so well, in an empirical or ethnographic approach, beyond laboratory studies. But if we want to accelerate adoption we need to further the understanding on how current tasks can be improved. How this new form of interaction can increase their productivity. In this paper we focus on analyzing and converging what we think works, and envisioning how this new set of immersive devices and interactions can enable productivity beyond already existing tools. View details
    Using Early Readouts to Mediate Featural Bias in Distillation
    Rishabh Tiwari
    Durga Sivasubramanian
    Anmol Mekala
    Ganesh Ramakrishnan
    WACV 2024 (2024)
    Preview abstract Deep networks tend to learn spurious feature-label correlations in real-world supervised learning tasks. This vulnerability is aggravated in distillation, where a (student) model may have less representational capacity than the corresponding teacher model. Often, knowledge of specific problem features is used to reweight instances & rebalance the learning process. We propose a novel early readout mechanism whereby we attempt to predict the label using representations from earlier network layers. We show that these early readouts automatically identify problem instances or groups in the form of confident, incorrect predictions. We improve group fairness measures across benchmark datasets by leveraging these signals to mediate between teacher logits and supervised label. We extend our results to the closely related but distinct problem of domain generalization, which also critically depends on the quality of learned features. We provide secondary analyses that bring insight into the role of feature learning in supervision and distillation. View details
    Preview abstract Effective model calibration is a critical and indispensable component in developing Media Mix Models (MMMs). One advantage of Bayesian-based MMMs lies in their capacity to accommodate the information from experiment results and the modelers' domain knowledge about the ad effectiveness by setting priors for the model parameters. However, it remains ambiguous about how and which Bayesian priors should be tuned for calibration purpose. In this paper, we propose a new calibration method through model reparameterization. The reparameterized model includes Return on Ads Spend (ROAS) as a model parameter, enabling straightforward adjustment of its prior distribution to align with either experiment results or the modeler's prior knowledge. The proposed method also helps address several key challenges regarding combining MMMs and incrementality experiments. We use simulations to demonstrate that our approach can significantly reduce the bias and uncertainty in the resultant posterior ROAS estimates. View details
    Preview abstract We propose Model Swarms, a collaborative search algorithm to adapt LLM experts via swarm intelligence. Specifically, Model Swarms starts with a pool of LLM experts and a utility function. Guided by the best-found checkpoints across models, diverse LLM experts collaboratively move in the weight space and search for adapted models that optimize the utility function. Compared to existing model composition approaches, Model Swarms offers modularity, works in low-data regimes, and doesn't need assumptions about existing experts and how they should be composed. Extensive experiments demonstrate that Model Swarms could flexibly adapt LLM experts to a single dataset, multi-dataset domains, reward models, as well as diverse human preferences. Further analysis reveals that LLM experts discover previously unseen capabilities in the search process and that Model Swarms enable the weak-to-strong transition of experts through the collaborative search process. View details
    Resolving Code Review Comments with Machine Learning
    Alexander Frömmgen
    Peter Choy
    Elena Khrapko
    Marcus Revaj
    2024 IEEE/ACM 46th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) (to appear)
    Preview abstract Code reviews are a critical part of the software development process, taking a significant amount of the code authors’ and the code reviewers’ time. As part of this process, the reviewer inspects the proposed code and asks the author for code changes through comments written in natural language. At Google, we see millions of reviewer comments per year, and authors require an average of ∼60 minutes active shepherding time between sending changes for review and finally submitting the change. In our measurements, the required active work time that the code author must devote to address reviewer comments grows almost linearly with the number of comments. However, with machine learning (ML), we have an opportunity to automate and streamline the code-review process, e.g., by proposing code changes based on a comment’s text. We describe our application of recent advances in large sequence models in a real-world setting to automatically resolve code-review comments in the day-to-day development workflow at Google. We present the evolution of this feature from an asynchronous generation of suggested edits after the reviewer sends feedback, to an interactive experience that suggests code edits to the reviewer at review time. In deployment, code-change authors at Google address 7.5% of all reviewer comments by applying an ML-suggested edit. The impact of this will be to reduce the time spent on code reviews by hundreds of thousands of engineer hours annually at Google scale. Unsolicited, very positive feedback highlights that the impact of ML-suggested code edits increases Googlers’ productivity and allows them to focus on more creative and complex tasks. View details
    Creative ML Assemblages: The Interactive Politics of People, Processes, and Products
    Ramya Malur Srinivasan
    Katharina Burgdorf
    Jennifer Lena
    ACM Conference on Computer Supported Cooperative Work and Social Computing (2024) (to appear)
    Preview abstract Creative ML tools are collaborative systems that afford artistic creativity through their myriad interactive relationships. We propose using ``assemblage thinking" to support analyses of creative ML by approaching it as a system in which the elements of people, organizations, culture, practices, and technology constantly influence each other. We model these interactions as ``coordinating elements" that give rise to the social and political characteristics of a particular creative ML context, and call attention to three dynamic elements of creative ML whose interactions provide unique context for the social impact a particular system as: people, creative processes, and products. As creative assemblages are highly contextual, we present these as analytical concepts that computing researchers can adapt to better understand the functioning of a particular system or phenomena and identify intervention points to foster desired change. This paper contributes to theorizing interactions with AI in the context of art, and how these interactions shape the production of algorithmic art. View details
    Preview abstract This paper reflects on work at Google over the past decade to address common types of software safety and security defects. Our experience has shown that software safety is an emergent property of the software and tooling ecosystem it is developed in and the production environment into which it is deployed. Thus, to effectively prevent common weaknesses at scale, we need to shift-left the responsibility for ensuring safety and security invariants to the end-to-end developer ecosystem, that is, programming languages, software libraries, application frameworks, build and deployment tooling, the production platform and its configuration surfaces, and so forth. Doing so is practical and cost effective when developer ecosystems are designed with application archetypes in mind, such as web or mobile apps: The design of the developer ecosystem can address threat model aspects that apply commonly to all applications of the respective archetype, and investments to ensure safety invariants at the ecosystem level amortize across many applications. Applying secure-by-design principles to developer ecosystems at Google has achieved drastic reduction and in some cases near-zero residual rates of common classes of defects, across hundreds of applications being developed by thousands of developers. View details
    Experiencing InstructPipe: Building Multi-modal AI Pipelines via Prompting LLMs and Visual Programming
    Zhongyi Zhou
    Jing Jin
    Xiuxiu Yuan
    Jun Jiang
    Jingtao Zhou
    Yiyi Huang
    Kristen Wright
    Jason Mayes
    Mark Sherwood
    Ram Iyengar
    Na Li
    Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems, ACM, pp. 5
    Preview abstract Foundational multi-modal models have democratized AI access, yet the construction of complex, customizable machine learning pipelines by novice users remains a grand challenge. This paper demonstrates a visual programming system that allows novices to rapidly prototype multimodal AI pipelines. We first conducted a formative study with 58 contributors and collected 236 proposals of multimodal AI pipelines that served various practical needs. We then distilled our findings into a design matrix of primitive nodes for prototyping multimodal AI visual programming pipelines, and implemented a system with 65 nodes. To support users' rapid prototyping experience, we built InstructPipe, an AI assistant based on large language models (LLMs) that allows users to generate a pipeline by writing text-based instructions. We believe InstructPipe enhances novice users onboarding experience of visual programming and the controllability of LLMs by offering non-experts a platform to easily update the generation. View details
    UI Mobility Control in XR: Switching UI Positionings between Static, Dynamic, and Self Entities
    Siyou Pei
    Yang Zhang
    Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, ACM, pp. 12 (to appear)
    Preview abstract Extended reality (XR) has the potential for seamless user interface (UI) transitions across people, objects, and environments. However, the design space, applications, and common practices of 3D UI transitions remain underexplored. To address this gap, we conducted a need-finding study with 11 participants, identifying and distilling a taxonomy based on three types of UI placements --- affixed to static, dynamic, or self entities. We further surveyed 113 commercial applications to understand the common practices of 3D UI mobility control, where only 6.2% of these applications allowed users to transition UI between entities. In response, we built interaction prototypes to facilitate UI transitions between entities. We report on results from a qualitative user study (N=14) on 3D UI mobility control using our FingerSwitches technique, which suggests that perceived usefulness is affected by types of entities and environments. We aspire to tackle a vital need in UI mobility within XR. View details
    Human I/O: Towards Comprehensive Detection of Situational Impairments in Everyday Activities
    Xingyu Bruce Liu
    Jiahao Nick Li
    Xiang 'Anthony' Chen
    Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, ACM, pp. 18
    Preview abstract Situationally Induced Impairments and Disabilities (SIIDs) can significantly hinder user experience in everyday activities. Despite their prevalence, existing adaptive systems predominantly cater to specific tasks or environments and fail to accommodate the diverse and dynamic nature of SIIDs. We introduce Human I/O, a real-time system that detects SIIDs by gauging the availability of human input/output channels. Leveraging egocentric vision, multimodal sensing and reasoning with large language models, Human I/O achieves good performance in availability prediction across 60 in-the-wild egocentric videos in 32 different scenarios. Further, while the core focus of our work is on the detection of SIIDs rather than the creation of adaptive user interfaces, we showcase the utility of our prototype via a user study with 10 participants. Findings suggest that Human I/O significantly reduces effort and improves user experience in the presence of SIIDs, paving the way for more adaptive and accessible interactive systems in the future. View details
    Preview abstract 2022 marked the 50th anniversary of memory safety vulnerabilities, first reported by Anderson et al. Half a century later, we are still dealing with memory safety bugs despite substantial investments to improve memory unsafe languages. Like others', Google’s data and internal vulnerability research show that memory safety bugs are widespread and one of the leading causes of vulnerabilities in memory-unsafe codebases. Those vulnerabilities endanger end users, our industry, and the broader society. At Google, we have decades of experience addressing, at scale, large classes of vulnerabilities that were once similarly prevalent as memory safety issues. Based on this experience we expect that high assurance memory safety can only be achieved via a Secure-by-Design approach centered around comprehensive adoption of languages with rigorous memory safety guarantees. We see no realistic path for an evolution of C++ into a language with rigorous memory safety guarantees that include temporal safety. As a consequence, we are considering a gradual transition of C++ code at Google towards other languages that are memory safe. Given the large volume of pre-existing C++, we believe it is nonetheless necessary to improve the safety of C++ to the extent practicable. We are considering transitioning to a safer C++ subset, augmented with hardware security features like MTE. View details
    An artificial neural network to estimate the foliar and ground cover input variables of the Rangeland Hydrology and Erosion Model
    Mahmoud Saeedimoghaddam
    David Goodrich
    Mariano Hernandez
    David Phillip Guertin
    Loretta J. Metz
    Guillermo Ponce-Campos
    Haiyan Wei
    Shea Burns
    Sarah E. McCord
    Mark A. Nearing
    C. Jason Williams
    Carrie-Ann Houdeshell
    Mashrekur Rahman
    Menberu B. Meles
    Steve Barker
    Journal of Hydrology (2024)
    Preview abstract Models like the Rangeland Hydrology and Erosion Model (RHEM) are useful for estimating soil erosion, however, they rely on input parameters that are sometimes difficult or expensive to measure. Specifically, RHEM requires information about foliar and ground cover fractions that generally must be measured in situ, which makes it difficult to use models like RHEM to produce erosion or soil risk maps for areas exceeding the size of a hillslope such as a large watershed. We previously developed a deep learning emulator of RHEM that has low computational expense and can, in principle, be run over large areas (e.g., over the continental US). In this paper, we develop a deep learning model to estimate the RHEM ground cover inputs from remote sensing time series, reducing the need for extensive field surveys to produce erosion maps. We achieve a prediction accuracy on hillslope runoff of r2=0.9, and on soil loss and sediment yield of r2 = 0.4 at 66,643 field locations within the US. We demonstrate how this approach can be used for mapping by developing runoff, soil loss, and sediment yield maps over a 1356 km2 region of interest in Nebraska. View details