Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

people standing in front of a screen with images and a chipboard

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
1 - 15 of 10129 publications
    Preview abstract Understanding complex relationships between human behavior and local contexts is crucial for various applications in public health, social science, and environmental studies. Traditional approaches often make use of small sets of manually curated, domain-specific variables to represent human behavior, and struggle to capture these intricate connections, particularly when dealing with diverse data types. To address this challenge, this work introduces a novel approach that leverages the power of graph neural networks (GNNs). We first construct a large dataset encompassing human-centered variables aggregated at postal code and county levels across the United States. This dataset captures rich information on human behavior (internet search behavior and mobility patterns) along with environmental factors (local facility availability, temperature, and air quality). Next, we propose a GNN-based framework designed to encode the connections between these diverse features alongside the inherent spatial relationships between postal codes and their containing counties. We then demonstrate the effectiveness of our approach by benchmarking the model on 27 target variables spanning three distinct domains: health, socioeconomic factors, and environmental measurements. Through spatial interpolation, extrapolation, and super-resolution tasks, we show that the proposed method can effectively utilize the rich feature set to achieve accurate predictions across diverse geospatial domains. View details
    Preview abstract Large Language Models (LLMs) may offer transformative opportunities for text input, especially for physically demanding modalities like handwriting. We studied a form of abbreviated handwriting by designing, developing and evaluating a prototype, named SkipWriter, that convert handwritten strokes of a variable-length, prefix- based abbreviation (e.g., “ho a y” as handwritten strokes) into the intended full phrase (e.g., “how are you” in the digital format) based on preceding context. SkipWriter consists of an in-production hand-writing recognizer and a LLM fine-tuned on this skip-writing task. With flexible pen input, SkipWriter allows the user to add and revise prefix strokes when predictions don’t match the user’s intent. An user evaluation demonstrated a 60% reduction in motor movements with an average speed of 25.78 WPM. We also showed that this reduction is close to the ceiling of our model in an offline simulation. View details
    Preview abstract Computing efficient traffic signal plans is often based on the amount of traffic in an intersection, its distribution over the various intersection movements and hours as well as on performance metrics such as traffic delay. In their simple and typical form plans are fixed in the same hour over weekdays. This allows low operation costs without the necessity for traffic detection and monitoring tools. A critical factor on the potential efficiency of such plans is the similarity of traffic patterns over the days along each of the intersection movements. We refer to such similarity as the traffic stability of the intersection and define simple metrics to measure it based on traffic volume and traffic delay. In this paper, we propose an automatic probe data based method, for city-wide estimation of traffic stability. We discuss how such measures can be used for signal planning such as in selecting plan resolution or as an indication as which intersections can benefit from dynamic but expensive traffic detection tools. We also identify events of major changes in traffic characteristics of an intersection. We demonstrate the framework by using real traffic statistics to study the traffic stability in the city of Haifa along its 162 intersections. We study the impact of the time of day on the stability, detect major changes in traffic and find intersections with high and low stability. View details
    SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL
    Shannon Bales
    Matthew Brown
    Jean-Daniel Browne
    Brandon Dolphin
    Romit Kudtarkar
    Andrey Litvinov
    Jingchi Ma
    John Morcos
    Michael Shen
    David Wilhite
    Xi Wu
    Lulan Yu
    Proc. VLDB Endow. (2024), pp. 4051-4063 (to appear)
    Preview abstract SQL has been extremely successful as the de facto standard language for working with data. Virtually all mainstream database-like systems use SQL as their primary query language. But SQL is an old language with significant design problems, making it difficult to learn, difficult to use, and difficult to extend. Many have observed these challenges with SQL, and proposed solutions involving new languages. New language adoption is a significant obstacle for users, and none of the potential replacements have been successful enough to displace SQL. In GoogleSQL, we’ve taken a different approach - solving SQL’s problems by extending SQL. Inspired by a pattern that works well in other modern data languages, we added piped data flow syntax to SQL. The results are transformative - SQL becomes a flexible language that’s easier to learn, use and extend, while still leveraging the existing SQL ecosystem and existing userbase. Improving SQL from within allows incrementally adopting new features, without migrations and without learning a new language, making this a more productive approach to improve on standard SQL. View details
    Preview abstract This paper reflects on work at Google over the past decade to address common types of software safety and security defects. Our experience has shown that software safety is an emergent property of the software and tooling ecosystem it is developed in and the production environment into which it is deployed. Thus, to effectively prevent common weaknesses at scale, we need to shift-left the responsibility for ensuring safety and security invariants to the end-to-end developer ecosystem, that is, programming languages, software libraries, application frameworks, build and deployment tooling, the production platform and its configuration surfaces, and so forth. Doing so is practical and cost effective when developer ecosystems are designed with application archetypes in mind, such as web or mobile apps: The design of the developer ecosystem can address threat model aspects that apply commonly to all applications of the respective archetype, and investments to ensure safety invariants at the ecosystem level amortize across many applications. Applying secure-by-design principles to developer ecosystems at Google has achieved drastic reduction and in some cases near-zero residual rates of common classes of defects, across hundreds of applications being developed by thousands of developers. View details
    Preview abstract In the present computerized period, information driven navigation is essential for the progress of cooperative work areas. This paper gives an extensive examination of how information designing, distributed storage, and business insight synergistically engage groups. We look at the basic standards of information designing, zeroing in on the plan, development, and the management of adaptable information pipelines. The job of distributed storage is investigated, featuring its ability to give adaptable, secure, and open information arrangements. Besides, we dive into business knowledge instruments and their capacity to change crude information into significant experiences. Through contextual analyses and exact information, we delineate the groundbreaking effect of these advances in group efficiency, coordinated effort, and dynamic cycles. This examination highlights the significance of incorporating hearty information designing works on, utilizing distributed storage arrangements, and utilizing complex business knowledge apparatuses to establish information engaged cooperative conditions. View details
    Preview abstract The latent space of diffusion model mostly still remains unexplored, despite its great success and potential in the field of generative modeling. In fact, the latent space of existing diffusion models are entangled, with a distorted mapping from its latent space to image space. To tackle this problem, we present Isometric Diffusion, equipping a diffusion model with a geometric regularizer to guide the model to learn a geometrically sound latent space. Our approach allows diffusion models to learn a more disentangled latent space, which enables smoother interpolation, more accurate inversion, and more precise control over attributes directly in the latent space. Extensive experiments illustrate advantages of the proposed method in image interpolation, image inversion, and linear editing. View details
    Preview abstract In this survey, we summarize recent developments in research fueled by the growing adoption of automated bidding strategies in online advertising. We explore the challenges and opportunities that have arisen as markets embrace this autobidding and cover a range of topics in this area, including bidding algorithms, equilibrium analysis and efficiency of common auction formats, and optimal auction design. View details
    Preview abstract Millions of people turn to Google Search each day for information on things as diverse as new cars or flu symptoms. The terms that they enter contain valuable information on their daily intent and activities, but the information in these search terms has been difficult to fully leverage. User-defined categorical filters have been the most common way to shrink the dimensionality of search data to a tractable size for analysis and modeling. In this paper we present a new approach to reducing the dimensionality of search data while retaining much of the information in the individual terms without user-defined rules. Our contributions are two-fold: 1) we introduce SLaM Compression, a way to quantify search terms using pre-trained language models and create a representation of search data that has low dimensionality, is memory efficient, and effectively acts as a summary of search, and 2) we present CoSMo, a Constrained Search Model for estimating real world events using only search data. We demonstrate the efficacy of our contributions by estimating with high accuracy U.S. automobile sales and U.S. flu rates using only Google Search data. View details
    Scalable Learning of Segment-Level Traffic Congestion Functions
    Shushman Choudhury
    Aboudy Kreidieh
    Alexandre Bayen
    IEEE Intelligent Transportation Systems Conference (2024)
    Preview abstract We propose and study a data-driven framework for identifying traffic congestion functions (numerical relationships between observations of traffic variables) at global scale and segment-level granularity. In contrast to methods that estimate a separate set of parameters for each roadway, ours learns a single black-box function over all roadways in a metropolitan area. First, we pool traffic data from all segments into one dataset, combining static attributes with dynamic time-dependent features. Second, we train a feed-forward neural network on this dataset, which we can then use on any segment in the area. We evaluate how well our framework identifies congestion functions on observed segments and how it generalizes to unobserved segments and predicts segment attributes on a large dataset covering multiple cities worldwide. For identification error on observed segments, our single data-driven congestion function compares favorably to segment-specific model-based functions on highway roads, but has room to improve on arterial roads. For generalization, our approach shows strong performance across cities and road types: both on unobserved segments in the same city and on zero-shot transfer learning between cities. Finally, for predicting segment attributes, we find that our approach can approximate critical densities for individual segments using their static properties. View details
    Preview abstract Inter-sentence pauses are the silences that occur between sentences in a paragraph or a dialogue. They are an important aspect of long-form speech prosody, as they can affect the naturalness, intelligibility, and effectiveness of communication. However, the user perception of inter-sentence pauses in long-form speech synthesis is not well understood. Previous work often evaluates pause modelling in conjunction with other prosodic features making it hard to explicitly study how raters perceive differences in inter-sentence pause lengths. In this paper, using multiple text-to-speech (TTS) datasets that cover different content types, domains, and settings, we investigate how sensitive raters are to changes to the durations of inter-sentence pauses in long-form speech by comparing ground truth audio samples with renditions that have manipulated pause durations. This experimental design is meant to allow us to draw conclusions regarding the utility that can be expected from similar evaluations when applied to synthesized long-form speech. We find that, using standard evaluation methodologies, raters are not sensitive to variations in pause lengths unless these deviate exceedingly from the norms or expectations of the speech context. View details
    Preview abstract This is the seventh installment of the Developer Productivity for Humans column. This installment focuses on software quality: what it means, how developers see it, how we break it down into 4 types of quality, and the impact these have on each other. View details
    Beyond SOT: Tracking Multiple Generic Objects at Once
    Christoph Mayer
    Martin Danelljan
    Vittorio Ferrari
    Luc Van Gool
    WACV'24 (2024)
    Preview abstract Generic Object Tracking (GOT) is the problem of tracking target objects, specified by bounding boxes in the first frame of a video. While the task has received much attention in the last decades, researchers have almost exclusively focused on the single object setting. However multiobject GOT poses its own challenges and is more attractive in real-world applications. We attribute the lack of research interest into this problem to the absence of suitable benchmarks. In this work, we introduce a new largescale GOT benchmark, LaGOT, containing multiple annotated target objects per sequence. Our benchmark allows users to tackle key remaining challenges in GOT, aiming to increase robustness and reduce computation through joint tracking of multiple objects simultaneously. In addition, we propose a transformer-based GOT tracker baseline capable of joint processing of multiple objects through shared computation. Our approach achieves a 4× faster run-time in case of 10 concurrent objects compared to tracking each object independently and outperforms existing single object trackers on our new benchmark. In addition, our approach achieves highly competitive results on single-object GOT datasets, setting a new state of the art on TrackingNet with a success rate AUC of 84.4%. Our benchmark, code, results and trained models are available at https://github.com/visionml/pytracking. View details
    Preview abstract Probabilistic forecasting is crucial to decision-making under uncertainty about future weather. The dominant approach is to use an ensemble of forecasts to represent and quantify uncertainty in operational numerical weather prediction. However, generating ensembles is computationally costly. In this paper, we propose to generate ensemble forecasts at scale by leveraging recent advances in generative artificial intelligence. Our approach learns a data-driven probabilistic diffusion model from the 5-member ensemble GEFS reforecast dataset. The model can then be sampled efficiently to produce realistic weather forecasts, conditioned on a few members of the operational GEFS forecasting system. The generated ensembles have similar predictive skill as the full GEFS 31-member ensemble, evaluated against ERA5 reanalysis, and emulate well the statistics of large physics-based ensembles. We also apply the same methodology to developing a diffusion model for generative post-processing: the model directly learns to correct biases present in the emulated forecasting system by leveraging reanalysis data as labels during training. Ensembles from this generative post-processing model show greater reliability and accuracy, particularly in extreme event classification. In general, they are more reliable and forecast the probability of extreme weather more accurately than the GEFS operational ensemble. Our models achieve these results at less than 1/10th of the computational cost incurred by the operational GEFS system. View details
    Streamlining Workload Management in AI-Driven Cloud Architectures: A Comparative Algorithmic Approach
    Pravallika Mannem
    Kiran Kumar Patibandla
    International Research Journal of Engineering and Technology, 11 (2024), pp. 113-121
    Preview abstract The use of artificial intelligence (AI) in cloud architectures has significantly increased processing efficiency and scale. However, with the development of complex algorithms and big data as well as surprisingly entered into our machine learning world; workload management becomes a significant issue in AI cloud computing. Existing workload management solutions are rule-based heuristics that may result in underutilization of resources and poor performance. For that, we present an algorithmic comparative approach to easing the burden of workload management for AI-driven cloud architectures. This is in contrast to executing a batch of tasks with different algorithms and comparing performance, cost, etc. We use ML methods to determine the best algorithm for our workload, and then deploy this in a self-contained binary that can switch between algorithms at runtime on an available resource. We validated our scheme with simulations, which demonstrates the capability of superior resource use and diminished completion time in comparison to rule-based schemes. When needed, flexibility and scalability allow you easier control over workloads that are subject to change or allocation. By simplifying AI-driven cloud workload management, the elasticity of their overall approach greatly enhances efficiency and scalability for those organizations looking to run even larger and take advantage of more complex workloads faster Tweet this Share on Facebook. View details