Algorithms and Theory

Google’s mission presents many exciting algorithmic and optimization challenges across different product areas including Search, Ads, Social, and Google Infrastructure. These include optimizing internal systems such as scheduling the machines that power the numerous computations done each day, as well as optimizations that affect core products and users, from online allocation of ads to page-views to automatic management of ad campaigns, and from clustering large-scale graphs to finding best paths in transportation networks. Besides applying new algorithmic ideas to impact millions of users, Google researchers contribute to state-of-the-art research in these areas by publishing in top conferences and journals.

Recent Publications

First Passage Percolation with Queried Hints
Kritkorn Karntikoon
Aaron Schild
Yiheng Shen
Ali Sinop
AISTATS (2024)
Optimization problems are ubiquitous throughout the modern world. In many of these applications, the input is inherently noisy and it is expensive to probe all of the noise in the input before solving the relevant optimization problem. In this work, we study how much of that noise needs to be queried in order to obtain an approximately optimal solution to the relevant problem. We focus on the shortest path problem in graphs, where one may think of the noise as coming from real-time traffic. We consider the following model: start with a weighted base graph $G$ and multiply each edge weight by an independently chosen, uniformly random number in $[1,2]$ to obtain a random graph $G'$. This model is called first passage percolation. Mathematicians have studied this model extensively when $G$ is a $d$-dimensional grid graph, but the behavior of shortest paths in this model is still poorly understood in general graphs. We make progress in this direction for a class of graphs that resembles real-world road networks. Specifically, we prove that if the geometric realization of $G$ has constant doubling dimension, then for a given $s$-$t$ pair, we only need to probe the weights on $((\log n) / \epsilon)^{O(1)}$ edges in $G'$ in order to obtain a $(1 + \epsilon)$-approximation to the $s$-$t$ distance in $G'$. We also demonstrate experimentally that this result is pessimistic: one can even obtain a short path in $G'$ with a small number of probes to $G'$.
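The random model in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it assumes a 20 x 20 grid with unit base weights as the base graph $G$, and the helper names (`grid_graph`, `perturb`, `shortest_distance`) are hypothetical. It builds $G'$ by multiplying each edge weight by an independent uniform factor in $[1,2]$ and compares the $s$-$t$ distance in $G$ and $G'$. It does not implement the paper's probing algorithm; it reads every perturbed weight.

```python
import heapq
import random


def grid_graph(side):
    """Return a side x side grid as {(u, v): weight} with unit base weights."""
    edges = {}
    for r in range(side):
        for c in range(side):
            if r + 1 < side:
                edges[((r, c), (r + 1, c))] = 1.0
            if c + 1 < side:
                edges[((r, c), (r, c + 1))] = 1.0
    return edges


def perturb(edges, rng):
    """Multiply every edge weight by an independent Uniform[1, 2] factor."""
    return {e: w * rng.uniform(1.0, 2.0) for e, w in edges.items()}


def shortest_distance(edges, source, target):
    """Dijkstra on an undirected graph given as {(u, v): weight}."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


rng = random.Random(0)          # fixed seed so runs are repeatable
base = grid_graph(20)           # base graph G (an assumption for illustration)
noisy = perturb(base, rng)      # random graph G'
s, t = (0, 0), (19, 19)
d_base = shortest_distance(base, s, t)
d_noisy = shortest_distance(noisy, s, t)
print(d_base, d_noisy)
```

Because every multiplier lies in $[1,2]$, the distance in $G'$ is always between the distance in $G$ and twice that value, which gives a quick sanity check on any simulation of this model.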
Algorithms for the computation of alternative routes in road networks power many geographic navigation systems. A good set of alternative routes offers meaningful options to the user of the system and can support applications such as routing that is robust to failures (e.g., road closures, extreme traffic congestion, etc.) and routing with diverse preferences and objective functions. Algorithmic techniques for alternative route computation include the penalty method, via-node type algorithms (which deploy bidirectional search and plateau finding), and, more recently, electrical-circuit based algorithms. In this work we focus on the practically important family of via-node type algorithms and we aim to produce high quality alternative routes for road networks. We study alternative route computation in the presence of a fast routing infrastructure that relies on hierarchical routing (namely, CRP). We propose new approaches that rely on deep learning methods. Our training methodology utilizes the hierarchical partition of the graph and builds models to predict which boundary road segments in the partition should be crossed by the alternative routes. We describe our methods in detail and evaluate them against the previously studied architectures, as well as against a stronger baseline that we define in this work, showing improvements in quality in the road networks of Seattle, Paris, and Bangalore.
Effective model calibration is a critical and indispensable component in developing Media Mix Models (MMMs). One advantage of Bayesian-based MMMs lies in their capacity to accommodate the information from experiment results and the modelers' domain knowledge about the ad effectiveness by setting priors for the model parameters. However, it remains unclear which Bayesian priors should be tuned for calibration purposes, and how. In this paper, we propose a new calibration method through model reparameterization. The reparameterized model includes Return on Ads Spend (ROAS) as a model parameter, enabling straightforward adjustment of its prior distribution to align with either experiment results or the modeler's prior knowledge. The proposed method also helps address several key challenges regarding combining MMMs and incrementality experiments. We use simulations to demonstrate that our approach can significantly reduce the bias and uncertainty in the resultant posterior ROAS estimates.
Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding
Alizée Pace
Hugo Yèche
Bernhard Schölkopf
Gunnar Rätsch
The Twelfth International Conference on Learning Representations (2024)
A prominent challenge of offline reinforcement learning (RL) is the issue of hidden confounding. There, unobserved variables may influence both the actions taken by the agent and the outcomes observed in the data. Hidden confounding can compromise the validity of any causal conclusion drawn from the data and presents a major obstacle to effective offline RL. In this paper, we tackle the problem of hidden confounding in the nonidentifiable setting. We propose a definition of uncertainty due to confounding bias, termed delphic uncertainty, which uses variation over compatible world models, and differentiate it from the well-known epistemic and aleatoric uncertainties. We derive a practical method for estimating the three types of uncertainties, and construct a pessimistic offline RL algorithm to account for them. Our method does not assume identifiability of the unobserved confounders, and attempts to reduce the amount of confounding bias. We demonstrate through extensive experiments and ablations the efficacy of our approach on a sepsis management benchmark, as well as real electronic health records. Our results suggest that nonidentifiable confounding bias can be addressed in practice to improve offline RL solutions.
Data Exchange Markets via Utility Balancing
Aditya Bhaskara
Sungjin Im
Kamesh Munagala
Govind S. Sankar
WebConf (2024)
This paper explores the design of a balanced data-sharing marketplace for entities with heterogeneous datasets and machine learning models that they seek to refine using data from other agents. The goal of the marketplace is to encourage participation for data sharing in the presence of such heterogeneity. Our market design approach for data sharing focuses on interim utility balance, where participants contribute and receive equitable utility from refinement of their models. We present such a market model for which we study computational complexity, solution existence, and approximation algorithms for welfare maximization and core stability. We finally support our theoretical insights with simulations on a mean estimation task inspired by road traffic delay estimation.
Network Flow Problems with Electric Vehicles
Haripriya Pulyassary
Aaron Schild
David Shmoys
Manxi Wu
IPCO (2024)
Electric vehicle (EV) adoption in long-distance logistics faces challenges like range anxiety and uneven distribution of charging stations. Two pivotal questions emerge: How can EVs be efficiently routed in a charging network considering range limits, charging speeds and prices? And can the existing charging infrastructure sustain the increasing demand for EVs in long-distance logistics? This paper addresses these questions by introducing a novel theoretical and computational framework to study EV network flow problems. We present an EV network flow model that incorporates range restrictions and nonlinear charging rates, and identify conditions under which polynomial-time solutions can be obtained for optimal single EV routing, maximum flow, and minimum cost flow problems. We develop efficient computational methods for computing the optimal routing and flow vector using a novel graph augmentation technique. Our findings provide insights for optimizing EV routing in logistics, ensuring an efficient and sustainable future.