# Kostas Kollias

### Research Areas

Authored Publications

Google Publications

Other Publications

Sort By

Data Exchange Markets via Utility Balancing

Aditya Bhaskara

Sungjin Im

Kamesh Munagala

Govind S. Sankar

WebConf (2024)

Preview abstract
This paper explores the design of a balanced data-sharing marketplace for entities with heterogeneous datasets and machine learning models that they seek to refine using data from other agents. The goal of the marketplace is to encourage participation for data sharing in the presence of such heterogeneity. Our market design approach for data sharing focuses on interim utility balance, where participants contribute and receive equitable utility from refinement of their models. We present such a market model for which we study computational complexity, solution existence, and approximation algorithms for welfare maximization and core stability. We finally support our theoretical insights with simulations on a mean estimation task inspired by road traffic delay estimation.
View details

Efficient Location Sampling Algorithms for Road Networks

Vivek Kumar

Santhoshini Velusamy

WebConf (2024)

Preview abstract
Many geographic information systems applications rely on the data provided by user devices in the road network. Such applications include traffic monitoring, driving navigation, detecting road closures or the construction of new roads, etc. This signal is collected by sampling locations from the user trajectories and is a critical process for all such systems. Yet, it has not been sufficiently studied in the literature. The most natural way to sample a trajectory is perhaps using a frequency based algorithm, e.g., sample every $x$ seconds. However, as we argue in this paper, such a simple strategy can be very wasteful in terms of resources (e.g., server-side processing, user battery) and in terms of the amount of user data that it maintains. In this work we conduct a horizontal study of various location sampling algorithms (including frequency-based, road geography-based, reservoir-sampling based, etc.) and extract their trade-offs in terms of various metrics of interest, such as, the size of the stored data and the induced quality of training for prediction tasks (e.g., predicting speeds) using the road network of New York City.
View details

First Passage Percolation with Queried Hints

Kritkorn Karntikoon

Aaron Schild

Yiheng Shen

Ali Sinop

AISTATS (2024)

Preview abstract
Optimization problems are ubiquitous throughout the modern world. In many of these applications, the input is inherently noisy and it is expensive to probe all of the noise in the input before solving the relevant optimization problem. In this work, we study how much of that noise needs to be queried in order to obtain an approximately optimal solution to the relevant problem. We focus on the shortest path problem in graphs, where one may think of the noise as coming from real-time traffic. We consider the following model: start with a weighted base graph $G$ and multiply each edge weight by an independently chosen, uniformly random number in $[1,2]$ to obtain a random graph $G'$. This model is called \emph{first passage percolation}. Mathematicians have studied this model extensively when $G$ is a $d$-dimensional grid graph, but the behavior of shortest paths in this model is still poorly understood in general graphs. We make progress in this direction for a class of graphs that resembles real-world road networks. Specifically, we prove that if the geometric realization of $G$ has constant doubling dimension, then for a given $s-t$ pair, we only need to probe the weights on $((\log n) / \epsilon)^{O(1)}$ edges in $G'$ in order to obtain a $(1 + \epsilon)$-approximation to the $s-t$ distance in $G'$. We also demonstrate experimentally that this result is pessimistic -- one can even obtain a short path in $G'$ with a small number of probes to $G'$.
View details

Weighted Stackelberg Algorithms for Road Traffic Optimization

Arun Chandrashekharapuram

Lisa Fawcett

Ali Sinop

SIGSPATIAL (2021)

Preview abstract
We study the problem faced by an online navigation platform that wishes to improve the total cost experienced by drivers in the road network, i.e., minimize the total travel time of vehicles on the streets. The platform has access to a fixed fraction of the traffic and needs to route it in a manner that improves the overall conditions in the network, both for platform users and non-users. This setting has been studied in the context of designing {\em Stackelberg strategies} for selfish routing games. An additional consideration for a routing platform in this setting is that it should ensure a positive experience for its users. The reasons for this are twofold: (a) the platform has a customer-service provider relationship with its users and (b) the users will opt out of following the platform's recommendations if they are systematically poor and the overall routing optimization effort will fail. This aspect is not explicitly addressed in standard Stackelberg algorithms and, in particular, some of them (e.g., Largest Latency First) move in the opposite direction of assigning user traffic to the most expensive paths so that the (selfish) non-user traffic can utilize the better part of the network. To address this challenge we formulate a weighted version of the Stackelberg routing problem in which the delay experienced by non-users of the platform is discounted by some parameter $\beta < 1$. We study natural algorithms for this problem and provide provable guarantees in the form of constant approximation ratios for various settings. In simulations with real graphs, data-induced delay functions, and realistic demands, we exhibit that such strategies improve the experience of both platform users and independent traffic in the road network and extract the trade-off between providing high quality service to the platform's users and improving the overall total cost.
View details

Contextual Recommendations and Low-Regret Cutting-Plane Algorithms

Guru Prashanth Guruganesh

NeurIPS (2021)

Preview abstract
We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems. We wish to learn a hidden $d$-dimensional value $w^*$. Every round, we are presented with a subset $\mathcal{X}_t \subseteq \mathbb{R}^d$ of possible actions. If we choose (i.e. recommend to the user) action $x_t$, we obtain utility $\langle x_t, w^* \rangle$ but only learn the identity of the best action $\arg\max_{x \in \X_t} \langle x, w^* \rangle$.
We design algorithms for this problem which achieve regret $O(d\log T)$ and $\exp(O(d \log d))$. To accomplish this, we design novel cutting-plane algorithms with low “regret” -- the total distance between the true point $w^*$ and the hyperplanes the separation oracle returns.
We also consider the variant where we are allowed to provide a list of several recommendations. In this variant, we give an algorithm with $O(d^2 \log d)$ regret and list size $\poly(d)$. Finally, we construct nearly tight algorithms for a weaker variant of this problem where the learner only learns the identity of an action that is better than the recommendation. Our results rely on new algorithmic techniques in convex geometry (including a variant of Steiner’s formula for the centroid of a convex set) which may be of independent interest.
View details

Orienteering Algorithms for Generating Travel Itineraries

Zachary Friggstad

Chaitanya Swamy

WSDM (2018)

Preview abstract
We study the problem of automatically and efficiently generating itineraries
for users who are on vacation. We focus on the common case, wherein the trip
duration is more than a single day. Previous efficient algorithms based on
greedy heuristics suffer from two problems. First, the itineraries are often
unbalanced, with excellent days visiting top attractions followed by days of
exclusively lower-quality alternatives. Second, the trips often re-visit
neighborhoods repeatedly in order to cover increasingly low-tier points of
interest. Our primary technical contribution is an algorithm that addresses
both these problems by maximizing the quality of the worst day. We give
theoretical results showing that this algorithm's competitive factor is within
a factor two of the guarantee of the best available algorithm for a single day,
across many variations of the problem. We also give detailed empirical
evaluations using two distinct datasets: (a) anonymized Google historical visit
data and (b) Foursquare public check-in data. We show first that the overall
utility of our itineraries is almost identical to that of algorithms
specifically designed to maximize total utility, while the utility of the worst
day of our itineraries is roughly twice that obtained from other approaches.
We then turn to evaluation based on human raters who score our itineraries only
slightly below the itineraries created by human travel experts with deep
knowledge of the area.
View details

No Results Found