Jump to Content
Sreenivas Gollapudi

Sreenivas Gollapudi

Authored Publications
Google Publications
Other Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    Preview abstract Many geographic information systems applications rely on the data provided by user devices in the road network. Such applications include traffic monitoring, driving navigation, detecting road closures or the construction of new roads, etc. This signal is collected by sampling locations from the user trajectories and is a critical process for all such systems. Yet, it has not been sufficiently studied in the literature. The most natural way to sample a trajectory is perhaps using a frequency based algorithm, e.g., sample every $x$ seconds. However, as we argue in this paper, such a simple strategy can be very wasteful in terms of resources (e.g., server-side processing, user battery) and in terms of the amount of user data that it maintains. In this work we conduct a horizontal study of various location sampling algorithms (including frequency-based, road geography-based, reservoir-sampling based, etc.) and extract their trade-offs in terms of various metrics of interest, such as, the size of the stored data and the induced quality of training for prediction tasks (e.g., predicting speeds) using the road network of New York City. View details
    Preview abstract Algorithms for the computation of alternative routes in road networks power many geographic navigation systems. A good set of alternative routes offers meaningful options to the user of the system and can support applications such as routing that is robust to failures (e.g., road closures, extreme traffic congestion, etc.) and routing with diverse preferences and objective functions. Algorithmic techniques for alternative route computation include the penalty method, via-node type algorithms (which deploy bidirectional search and finding plateaus), and, more recently, electrical-circuit based algorithms. In this work we focus on the practically important family of via-node type algorithms and we aim to produce high quality alternative routes for road netowrks. We study alternative route computation in the presence of a fast routing infrastructure that relies on hierarchical routing (namely, CRP). We propose new approaches that rely on deep learning methods. Our training methodology utilizes the hierarchical partition of the graph and builds models to predict which boundary road segments in the partition should be crossed by the alternative routes. We describe our methods in detail and evaluate them against the previously studied architectures, as well as against a stronger baseline that we define in this work, showing improvements in quality in the road networks of Seattle, Paris, and Bangalore. View details
    Preview abstract Historically, much of machine learning research has focused on the performance of the algorithm alone, but recently more attention has been focused on optimizing joint human-algorithm performance. Here, we analyze a specific type of human-algorithm collaboration where the algorithm has access to a set of $n$ items, and presents a subset of size $k$ to the human, who selects a final item from among those $k$. This scenario could model content recommendation, route planning, or any type of labeling task. Because both the human and algorithm have imperfect, noisy information about the true ordering of items, the key question is: which value of $k$ maximizes the probability that the best item will be ultimately selected? For $k=1$, performance is optimized by the algorithm acting alone, and for $k=n$ it is optimized by the human acting alone. Surprisingly, we show that for multiple of noise models, it is optimal to set $k \in [2, n-1]$ - that is, there are strict benefits to collaborating, even when the human and algorithm have equal accuracy separately. We demonstrate this theoretically for the Mallows model and experimentally for the Random Utilities models of noisy permutations. However, we show this pattern is \emph{reversed} when the human is anchored on the algorithm's presented ordering - the joint system always has strictly worse performance. We extend these results to the case where the human and algorithm differ in their accuracy levels, showing that there always exist regimes where a more accurate agent would strictly benefit from collaborating with a less accurate one, but these regimes are asymmetric between the human and the algorithm's accuracy. View details
    Preview abstract Online navigation platforms are well optimized to solve for the standard objective of minimizing the travel time and typically require precomputation-based architectures (such as Contraction Hierarchies and the Customizable Route Planning) to do so in a fast manner. The reason for this dependence is the size of the graph that represents the road network, which is large. The need to go beyond minimizing the travel time and introduce various types of customizations has led to approaches that rely on alternative route computation or, more generally, small subgraph extraction. On a small subgraph, one is able to run computationally expensive algorithms at query time and compute optimal solutions for multiple routing problems. In this framework, it is critical for the subgraph to a) be small and b) include (near) optimal routes for a collection of customizations. This is precisely the setting that we study in this work. We design algorithms that extract a subgraph connecting designated terminals with the objective to minimize the subgraph's size and the constraint to include near optimal routes for a set of predefined cost functions. We provide theoretical guarantees for our algorithms and evaluate them empirically using real world road networks. View details
    Data Exchange Markets via Utility Balancing
    Aditya Bhaskara
    Sungjin Im
    Kamesh Munagala
    Govind S. Sankar
    WebConf (2024)
    Preview abstract This paper explores the design of a balanced data-sharing marketplace for entities with heterogeneous datasets and machine learning models that they seek to refine using data from other agents. The goal of the marketplace is to encourage participation for data sharing in the presence of such heterogeneity. Our market design approach for data sharing focuses on interim utility balance, where participants contribute and receive equitable utility from refinement of their models. We present such a market model for which we study computational complexity, solution existence, and approximation algorithms for welfare maximization and core stability. We finally support our theoretical insights with simulations on a mean estimation task inspired by road traffic delay estimation. View details
    First Passage Percolation with Queried Hints
    Kritkorn Karntikoon
    Aaron Schild
    Yiheng Shen
    Ali Sinop
    AISTATS (2024)
    Preview abstract Optimization problems are ubiquitous throughout the modern world. In many of these applications, the input is inherently noisy and it is expensive to probe all of the noise in the input before solving the relevant optimization problem. In this work, we study how much of that noise needs to be queried in order to obtain an approximately optimal solution to the relevant problem. We focus on the shortest path problem in graphs, where one may think of the noise as coming from real-time traffic. We consider the following model: start with a weighted base graph $G$ and multiply each edge weight by an independently chosen, uniformly random number in $[1,2]$ to obtain a random graph $G'$. This model is called \emph{first passage percolation}. Mathematicians have studied this model extensively when $G$ is a $d$-dimensional grid graph, but the behavior of shortest paths in this model is still poorly understood in general graphs. We make progress in this direction for a class of graphs that resembles real-world road networks. Specifically, we prove that if the geometric realization of $G$ has constant doubling dimension, then for a given $s-t$ pair, we only need to probe the weights on $((\log n) / \epsilon)^{O(1)}$ edges in $G'$ in order to obtain a $(1 + \epsilon)$-approximation to the $s-t$ distance in $G'$. We also demonstrate experimentally that this result is pessimistic -- one can even obtain a short path in $G'$ with a small number of probes to $G'$. View details
    Preview abstract Graph Neural Networks (GNNs) have emerged as a powerful technique for learning on relational data. Owing to the relatively limited number of message passing steps they perform—and hence a smaller receptive field—there has been significant interest in improving their expressivity by incorporating structural aspects of the underlying graph. In this paper, we explore the use of affinity measures as features in graph neural networks, in particular measures arising from random walks, including effective resistance, hitting and commute times. We propose message passing networks based on these features and evaluate their performance on a variety of node and graph property prediction tasks. Our architecture has low computational complexity, while our features are invariant to the permutations of the underlying graph. The measures we compute allow the network to exploit the connectivity properties of the graph, thereby allowing us to outperform relevant benchmarks for a wide variety of tasks, often with significantly fewer message passing steps. On one of the largest publicly available graph regression datasets, OGB-LSC-PCQM4Mv1, we obtain the best known single-model validation MAE at the time of writing. View details
    Preview abstract We develop an online learning algorithm for a navigation platform to route travelers in a congested network with multiple origin-destination (o-d) pairs while simultaneously learning the unknown cost function of road segments (edges) using the crowd-sourced data. The number of travel requests is randomly realized, and the travel time of each edge is stochastically distributed with the mean being a linear function that increases with the edge load (the number of travelers who take the edge). In each step of our algorithm, the platform updates the estimates of cost function parameters using the collected travel time data, and maintains a rectangular confidence interval of each parameter. The platform then routes travelers in the next step using an optimistic strategy based on the lower bound of the up-to-date confidence interval. The key aspect of our setting is that the number of collected data on each edge depends on the number of travelers who use it, which creates the dependency between the spatial distribution of data points and the routing strategy. We prove that the regret upper bound of our algorithm is \(O(\sqrt{T}\log(T),|E|)\), where \(T\) is the time horizon, and \(|E|\) is the number of edges in the network. Furthermore, we show that the regret bound decreases as the number of travelers increases, which implies that the platform learns faster with a larger user population. Finally, we implement our algorithm on the network of New York City, and demonstrate the efficacy using the accumulated regret. View details
    Preview abstract We consider the classic stochastic multi-armed bandit (MAB) problem, when at each step, the online policy is given the extra power to probe (or observe) the rewards of $k$ arms before playing one of them. We derive new algorithms whose regret bounds in this model have exponentially better dependence on the time horizon when compared to the classic regret bounds. In particular, we show that $k=3$ probes suffice to achieve parameter-independent constant regret of $O(n^2)$, where $n$ is the number of arms. Such regret bounds cannot be achieved even with full feedback {\em after} the play (that is, in the experts model), showcasing the power of even a few probes {\em before} making the play. We present simulations that show benefit of our policies even on benign instances. View details
    Preview abstract For traffic routing platforms, the choice of which route to recommend to a user depends on the congestion on these routes -- indeed, an individual's utility depends on the number of people using the recommended route at that instance. Motivated by this, we introduce the problem of Congested Bandits where each arm's reward is allowed to depend on the number of times it was played in the past $\win$ timesteps. This dependence on past history of actions leads to a dynamical system where an algorithm's present choices also affect its future pay-offs, and requires an algorithm to plan for this. We study the congestion aware formulation in the multi-armed bandit (MAB) setup and in the contextual bandit setup with linear rewards. For the multi-armed setup, we propose a UCB style algorithm and show that its policy regret scales as $\tilde{O}(\sqrt{\K \win T})$. For the linear contextual bandit setup, our algorithm, based on an iterative least squares planner, achieves policy regret $\tilde{O}(\sqrt{dT} + \win)$. From an experimental standpoint, we corroborate the no-regret properties of our algorithms via a simulation study. View details
    Preview abstract Many online trip planning and navigation software need to routinely solve the problem of deciding where to take stops during a journey for various services such as refueling (or EV charging), rest stops, food, etc. The goal is to minimize the overhead of these stops while ensuring that the traveller is not starved of any essential resource (such as fuel, rest, or food) during the journey. In this paper, we formally model this problem and call it the {\em pit stop} problem. We design algorithms for this problem under various settings: single vs multiple types of stops, and offline vs online optimization (i.e., in advance of or during the trip). Our algorithms achieve provable guarantees in terms of approximating the optimal solution. We then extensively evaluate our algorithms on real world data and demonstrate that they significantly outperform baseline solutions. View details
    Preview abstract We study the problem faced by an online navigation platform that wishes to improve the total cost experienced by drivers in the road network, i.e., minimize the total travel time of vehicles on the streets. The platform has access to a fixed fraction of the traffic and needs to route it in a manner that improves the overall conditions in the network, both for platform users and non-users. This setting has been studied in the context of designing {\em Stackelberg strategies} for selfish routing games. An additional consideration for a routing platform in this setting is that it should ensure a positive experience for its users. The reasons for this are twofold: (a) the platform has a customer-service provider relationship with its users and (b) the users will opt out of following the platform's recommendations if they are systematically poor and the overall routing optimization effort will fail. This aspect is not explicitly addressed in standard Stackelberg algorithms and, in particular, some of them (e.g., Largest Latency First) move in the opposite direction of assigning user traffic to the most expensive paths so that the (selfish) non-user traffic can utilize the better part of the network. To address this challenge we formulate a weighted version of the Stackelberg routing problem in which the delay experienced by non-users of the platform is discounted by some parameter $\beta < 1$. We study natural algorithms for this problem and provide provable guarantees in the form of constant approximation ratios for various settings. In simulations with real graphs, data-induced delay functions, and realistic demands, we exhibit that such strategies improve the experience of both platform users and independent traffic in the road network and extract the trade-off between providing high quality service to the platform's users and improving the overall total cost. View details
    Preview abstract Generating alternative routes in road networks is an application of significant interest for online navigation systems. A high quality set of diverse alternate routes offers two functionalities - a) support support multiple (unknown) preferences that the user may have; and b) robust to changes in network conditions. We address the latter in this paper. The main techniques that produce alternative routes in road networks are the {\em penalty} and the {\em plateau} methods, with the former providing high quality results but being too slow for practical use and the latter being fast but suffering in terms of quality. In this work we propose a novel method to produce alternative routes that is fundamentally different from the aforementioned approaches. Our algorithm borrows concepts from electrical flows and their decompositions. We provide an analysis that proves theoretical properties of our algorithm and evaluate it against the penalty and plateau methods, showing that it is as fast as the plateau method while also recovering much of the headroom towards the quality of the penalty method. The metrics we use to evaluate performance include the {\em stretch} (the average cost of the routes), the {\em diversity}, and the {\em robustness} (the connectivity between the origin and destination) of the induced set of routes. View details
    Preview abstract We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems. We wish to learn a hidden $d$-dimensional value $w^*$. Every round, we are presented with a subset $\mathcal{X}_t \subseteq \mathbb{R}^d$ of possible actions. If we choose (i.e. recommend to the user) action $x_t$, we obtain utility $\langle x_t, w^* \rangle$ but only learn the identity of the best action $\arg\max_{x \in \X_t} \langle x, w^* \rangle$. We design algorithms for this problem which achieve regret $O(d\log T)$ and $\exp(O(d \log d))$. To accomplish this, we design novel cutting-plane algorithms with low “regret” -- the total distance between the true point $w^*$ and the hyperplanes the separation oracle returns. We also consider the variant where we are allowed to provide a list of several recommendations. In this variant, we give an algorithm with $O(d^2 \log d)$ regret and list size $\poly(d)$. Finally, we construct nearly tight algorithms for a weaker variant of this problem where the learner only learns the identity of an action that is better than the recommendation. Our results rely on new algorithmic techniques in convex geometry (including a variant of Steiner’s formula for the centroid of a convex set) which may be of independent interest. View details
    Preview abstract Motivated by the emergence of popular service-based two-sided markets where sellers can serve multiple buyers at the same time, we formulate and study the two-sided cost sharing problem. In two-sided cost sharing, sellers incur different costs for serving different subsets of buyers and buyers have different values for being served by different sellers. Both buyers and sellers are self-interested agents whose values and costs are private information. We study the problem from the perspective of an intermediary platform that matches buyers to sellers and assigns prices and wages in an effort to maximize gains from trade (i.e., buyer values minus seller costs) subject to budget-balance in an incentive compatible manner. In our markets of interest, agents trade the (often same) services multiple times. Moreover, the value and cost for the same service differs based on the context (e.g., location, urgency, weather conditions, etc). In this framework, we design mechanisms that are efficient, ex-ante budget-balanced, ex-ante individually rational, dominant strategy incentive compatible, and ex-ante in the core (a natural generalization of the core that we define here). View details
    Preview abstract A two-sided market consists of two sets of agents, each of whom have preferences over the other (Airbnb, Upwork, Lyft, Uber, etc.). We propose and analyze a repeated matching problem, where each time step consists of a single match, and our goal is to ensure fairness with respect to the cumulative allocations over an infinite time horizon. We prove several positive results, our main one being that for additive, symmetric ($v_i(j) = v_j(i)$), and binary ($v_i(j) \in \{a,1\}$) valuations, we can simultaneously (1) guarantee \emph{envy-freeness up to a single match} (EF1) and (2) select a maximum weight matching for each ``stage". Thus for this class of valuations, fairness can be achieved without sacrificing efficiency. This result holds even for \emph{dynamic valuations}, i.e., valuations that change over time. Although symmetry is a strong assumption, we show that this result cannot be extended to asymmetric binary valuations: (1) and (2) together are impossible even when valuations do not change over time, and for dynamic valuations, even (1) alone is impossible. To our knowledge, this is the first analysis of almost envy-freeness for two-sided preferences. View details
    Preview abstract Inspired by traffic routing applications, we consider the problem of finding the shortest path from a source $s$ to a destination $t$ in a graph, when the lengths of the edges are unknown. Instead, we are given {\em hints} or predictions of the edge lengths from a collection of ML models, trained possibly on historical data and other contexts in the network. Additionally, we assume that the true length of any candidate path can be obtained by {\em probing} an up-to-date snapshot of the network. However, each probe introduces a latency, and thus the goal is to minimize the number of probes while finding a near-optimal path with high probability. We formalize this problem and show assumptions under which it admits to efficient approximation algorithms. We verify these assumptions and validate the performance of our algorithms on real data. View details
    Preview abstract In recent years, a range of online applications have facilitated resource sharing among users, resulting in a significant increase in resource utilization. In all such applications, sharing one's resources or skills with other agents increases social welfare. In general, each agent will look for other agents whose available resources complement hers, thereby forming natural sharing groups. In this paper, we study settings where a large population self-organizes into sharing groups. In many cases, centralized optimization approaches for creating an optimal partition of the user population are infeasible because either the central authority does not have the necessary information to compute an optimal partition, or it does not have the power to enforce a partition. Instead, the central authority puts in place an incentive structure in the form of a utility sharing method, before letting the participants form the sharing groups by themselves. We first analyze a simple equal-sharing method, which is the one most typically encountered in practice and show that it can lead to highly inefficient equilibria. We then propose a Shapley-sharing method and show that it significantly improves overall social welfare. View details
    Preview abstract In this paper we introduce the hiring under uncertainty problem to model the questions faced by hiring committees in large enterprises and universities alike. Given a set of $n$ eligible candidates, the decision maker needs to choose the sequence of candidates to make offers so as to hire the $k$ best candidates. However, candidates may choose to reject an offer (for instance, due to a competing offer) and the decision maker has a time limit by which all positions must be filled. Given an estimate of the probabilities of acceptance for each candidate, the hiring under uncertainty problem is to design a strategy of making offers so that the total expected value of all candidates hired by the time limit is maximized. View details
    Preview abstract A core tension in the operations of online marketplaces is between segmentation (wherein platforms can increase revenue by segmenting the market into ever smaller sub-markets) and thickness (wherein the size of the sub-market affects the utility experienced by an agent). An important example of this is in dynamic online marketplaces, where buyers and sellers, in addition to preferences for different matches, also have finite patience (or deadlines) for being matched. We formalize this trade-off via a novel optimization problem that we term as 'Two-sided Facility Location': we consider a market wherein agents arrive at nodes embedded in an underlying metric space, where the distance between a buyer and seller captures the quality of the corresponding match. The platform posts prices and wages at the nodes, and opens a set of virtual clearinghouses where agents are routed for matching. To ensure high match-quality, the platform imposes a distance constraint between an agent and its clearinghouse; to ensure thickness, the platform requires the flow to any clearinghouse be at least a pre-specified lower bound. Subject to these constraints, the goal of the platform is to maximize the social surplus subject to weak budget balance, i.e., profit being non-negative. Our work characterizes the complexity of this problem by providing both hardness results as well as algorithms for this setting; in particular, we present an algorithm that yields a $(1 + \epsilon)$ approximation for the surplus for any constant $\epsilon > 0$, while relaxing the match quality (i.e., maximum distance of any match) by a constant factor. View details
    Preview abstract We study the problem of automatically and efficiently generating itineraries for users who are on vacation. We focus on the common case, wherein the trip duration is more than a single day. Previous efficient algorithms based on greedy heuristics suffer from two problems. First, the itineraries are often unbalanced, with excellent days visiting top attractions followed by days of exclusively lower-quality alternatives. Second, the trips often re-visit neighborhoods repeatedly in order to cover increasingly low-tier points of interest. Our primary technical contribution is an algorithm that addresses both these problems by maximizing the quality of the worst day. We give theoretical results showing that this algorithm's competitive factor is within a factor two of the guarantee of the best available algorithm for a single day, across many variations of the problem. We also give detailed empirical evaluations using two distinct datasets: (a) anonymized Google historical visit data and (b) Foursquare public check-in data. We show first that the overall utility of our itineraries is almost identical to that of algorithms specifically designed to maximize total utility, while the utility of the worst day of our itineraries is roughly twice that obtained from other approaches. We then turn to evaluation based on human raters who score our itineraries only slightly below the itineraries created by human travel experts with deep knowledge of the area. View details
    Minimizing Latency in Online Ride and Delivery Services
    Abhimanyu Das
    Anthony Kim
    Chaitanya Swamy
    Debmalya Panigrahi
    WWW 2018 (2018)
    Preview abstract Motivated by the popularity of online ride services such as Uber, Lyft, and Ola, we study natural variants of classical multi-vehicle minimum latency problems where the objective is to route a set of vehicles located at depots to serve request located on a metric space so as to minimize the total latency. In this paper, we consider point-to-point requests that come with source-destination pairs and release-time constraints that restrict when each request can be served. The point-to-point requests and release-time constraints model taxi rides and deliveries. For all the variants considered, we show constant-factor approximation algorithms based on a linear programming framework. To the best of our knowledge, these are the first set of results for the aforementioned variants of the minimum latency problems. Furthermore, we provide an empirical study of heuristics based on our theoretical algorithms on a real data set of taxi rides. View details
    Preview abstract We consider the problem of approximating a given matrix by a low-rank matrix so as to minimize the entrywise ℓp-approximation error, for any p≥1; the case p=2 is the classical SVD problem. We obtain the first provably good approximation algorithms for this version of low-rank approximation that work for every value of p≥1, including p=∞. Our algorithms are simple, easy to implement, work well in practice, and illustrate interesting tradeoffs between the approximation quality, the running time, and the rank of the approximating matrix. View details
    Segmenting Two-Sided Markets
    Siddhartha Banerjee
    Kamesh Munagala
    WWW (2017)
    Preview abstract Recent years have witnessed the rise of many successful ecommerce marketplaces like the Amazon marketplace, Uber, AirBnB, and Upwork, where a central platform mediates economic transactions between buyers and sellers. A common feature of many of these two-sided marketplaces is that the platform has full control over search and discovery, but prices are determined by the buyers and sellers. Motivated by this, we study the algorithmic aspects of market segmentation via directed discovery in two-sided markets with endogenous prices. We consider a model where an online platform knows each buyer/seller’s characteristics, and associated demand/supply elasticities. Moreover, the platform can use discovery mechanisms (search/recommendation/etc.) to control which buyers/sellers are visible to each other. This leads to a segmentation of the market into pools, following which buyers and sellers endogenously determine market clearing transaction prices within each pool. The aim of the platform is to maximize the resulting volume of transactions/welfare in the market. We develop efficient algorithms with provable guarantees under a variety of assumptions on the demand and supply functions. We also test the validity of our assumptions on demand curves inferred from NYC taxicab log-data, as well as show the performance of our algorithms on synthetic experiments. View details
    Partitioning Orders in Online Shopping Services
    Debmalya Panigrahi
    Conf. on Information and Knowledge Management (CIKM) (2017)
    Preview abstract The rapid growth of the sharing economy has led to the widespread use of newer and richer models of online shopping and delivery services. The race to deliver fast has transformed such services into complex networks of shoppers, stores, and consumers. Needless to say, the efficiency of the store order management is critical to the business. Motivated by this setting, we consider the following problem: given a set of online shopping orders each consisting of a few items, how to best partition the orders among a given number of pickers? Owing to logistical constraints the orders are typically unsplittable in the partition. This partitioning, taking the physical location of the items in the store , has to optimize the utilization and amount of work done by the shoppers in the store. Formulating this as a combinatorial optimization problem, we propose a family of simple and efficient algorithms that admit natural constraints arising in this setting. In addition to showing provable guarantees for the algorithms, we also demonstrate their efficiency in practice on real-world data from Google Express [1], outperforming natural baselines. View details
    Preview abstract We study utility games (Vetta, FOCS 2002) where a set of players join teams to produce social utility, and receive individual utility in the form of payments in return. These games have many natural applications in competitive settings such as labor markets, crowdsourcing, etc. The efficiency of such a game depends on the profit sharing mechanism -- the rule that maps utility produced by the players to their individual payments. We study three natural and widely used profit sharing mechanisms -- egalitarian or equal sharing, marginal gain or value addition when a player joins, and marginal loss or value depletion when a player leaves. For these settings, we give tight bounds on the price of anarchy, thereby allowing comparison between these popular mechanisms from a (worst case) social welfare perspective. View details
    No Results Found