# Hossein Esfandiari

Authored Publications

Sort By

Optimal Fully Dynamic k-Center Clustering for Adaptive and Oblivious Adversaries

Preview
Monika Henzinger

Andreas Wiese

Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)

Tackling Provably Hard Representative Selection via Graph Neural Networks

Transactions on Machine Learning Research (2023)

Preview abstract
Representative Selection (RS) is the problem of finding a small subset of exemplars from a dataset that is representative of the dataset. In this paper, we study RS for attributed graphs, and focus on finding representative nodes that optimize the accuracy of a model trained on the selected representatives. Theoretically, we establish a new hardness result for RS (in the absence of a graph structure) by proving that a particular, highly practical variant of it (RS for Learning) is hard to approximate in polynomial time within any reasonable factor, which implies a significant potential gap between the optimum solution of widely-used surrogate functions and the actual accuracy of the model. We then study the setting where a (homophilous) graph structure is available, or can be constructed, between the data points. We show that with an appropriate modeling approach, the presence of such a structure can turn a hard RS (for learning) problem into one that can be effectively solved. To this end, we develop RS-GNN, a representation learning-based RS model based on Graph Neural Networks. Empirically, we demonstrate the effectiveness of RS-GNN on problems with predefined graph structures as well as problems with graphs induced from node feature similarities, by showing that RS-GNN achieves significant improvements over established baselines on a suite of eight benchmarks.
View details

Improved Approximations for Euclidean k-means and k-median, via Nested Quasi-Independent Sets

Shyam Narayanan

54rd Annual ACM Symposium on Theory of Computing (STOC'22) (2022)

Preview abstract
Motivated by data analysis and machine learning applications, we consider the popular high-dimensional Euclidean $k$-median and $k$-means problems.
We propose a new primal-dual algorithm, inspired by the classic algorithm of Jain and Vazirani and the recent algorithm of Ahmadian et al.. Our algorithm achieves
an approximation ratio of respectively 2.40... and 5.95... for Euclidean $k$-median and $k$-means improving upon the
2.60... of Ahmadian et al. and the 6.12.. of Grandoni et al..
View details

Prophets, Secretaries, and Maximizing the Probability of Choosing the Best

Brendan Lucier

Michael Mitzenmacher

MohammadTaghi Hajiaghayi

AISTATS (2020) (to appear)

Preview abstract
Suppose a customer is faced with a sequence of fluctuating prices, such as for airfare or a product sold by a large online retailer. Given distributional information about what price they might face each day, how should they choose when to purchase in order to maximize the likelihood of getting the best price in retrospect? This is related to the classical secretary problem, but with values drawn from known distributions. In their pioneering work, Gilbert and Mosteller [J. Amer. Statist. Assoc. 1966] showed that when the values are drawn i.i.d., there is a thresholding algorithm that selects the best value with probability approximately 0.5801. However, the more general problem with non-identical distributions has remained unsolved.
In this paper, we provide an algorithm for the case of non-identical distributions that selects the maximum element with probability 1/e, and we show that this is tight. We further show that if the observations arrive in a random order, this barrier of 1/e can be broken using a static threshold algorithm, and we show that our success probability is the best possible for any single-threshold algorithm under random observation order. Moreover, we prove that one can achieve a strictly better success probability using more general multi-threshold algorithms, unlike the non-random-order case. Along the way, we show that the best achievable success probability for the random-order case matches that of the i.i.d. case, which is approximately 0.5801, under a ``no-superstars'' condition that no single distribution is very likely ex ante to generate the maximum value. We also extend our results to the problem of selecting one of the k best values.
One of the main tools in our analysis is a suitable ``Poissonization'' of random order distributions, which uses Le Cam's theorem to connect the Poisson binomial distribution with the discrete Poisson distribution. This approach may be of independent interest.
View details

Parallel Graph Algorithms in Constant Adaptive Rounds: Theory meets Practice

Soheil Behnezhad

Warren J Schudy

VLDB 2020

Preview abstract
We study fundamental graph problems such as graph connectivity, minimum spanning forest (MSF), and approximate maximum (weight) matching in a distributed setting. In particular, we focus on the Adaptive Massively Parallel Computation (AMPC) model, which is a theoretical model that captures MapReduce-like computation augmented with a distributed hash table.
We show the first AMPC algorithms for all of the studied problems that run in a constant number of rounds and use only O(n^ϵ) space per machine, where 0<ϵ<1. Our results improve both upon the previous results in the AMPC model, as well as the best-known results in the MPC model, which is the theoretical model underpinning many popular distributed computation frameworks, such as MapReduce, Hadoop, Beam, Pregel and Giraph.
Finally, we provide an empirical comparison of the algorithms in the MPC and AMPC models in a fault-tolerant distriubted computation environment. We empirically evaluate our algorithms on a set of large real-world graphs and show that our AMPC algorithms can achieve improvements in both running time and round-complexity over optimized MPC baselines.
View details

Categorical Feature Compression via Submodular Optimization

Lin Chen

International Conference on Machine Learning (2019), pp. 515-523

Preview abstract
In modern machine learning tasks, the presence of categorical features with extremely large vocabularies is a reality. This becomes a bottleneck when using an ML model, which generally grows at least linearly with the vocabulary size, affecting the memory, training and inference costs, as well as overfitting risk. In this work, we seek to compress the vocabulary by maximizing the mutual information between the compressed categorical feature and the target binary labels. We note the relationship of this problem to that of quantization in a discrete memoryless channel, where there exists a quadratic-time algorithm to solve the problem. Unfortunately, such an algorithm does not scale to data sets with massive vocabularies and, in this paper, we develop a distributed quasi-linear O(n log n) algorithm with provable approximation guarantees. We first observe that although entropy is a submodular function, this is not the case for mutual information between a categorical feature and label. To tackle this problem, we define a set function over a different space, which still contains the optimal solution, and prove this function is submodular. We also provide a query oracle to the submodular function that runs in amortized logarithmic time, and is easy to compute in a distributed fashion. Combining these results with a greedy algorithm allows us to achieve a (1-1/e)-approximation in quasi-linear time. Finally, we compare our proposed algorithm to several existing approaches using the large-scale Criteo learning task and demonstrate better performance in retaining mutual information and also verify the learning performance of the compressed vocabulary.
View details

Massively Parallel Computation via Remote Memory Access

Laxman Dhulipala

Soheil Behnezhad

Warren Schudy

SPAA 2019

Preview abstract
We introduce the Adaptive Massively Parallel Computation (AMPC) model, which is an extension of the widely popular Massively Parallel Computation (MPC) model. At a high level, the AMPC model strengthens the MPC model by storing all messages sent within a round in a distributed data store. In the following round all machines are provided with random read access to the data store, subject to the same constraints on the total amount of communication as in the MPC model. Our model is inspired by the previous empirical studies of distributed graph algorithms using MapReduce and a distributed hash table service.
This extension allows us to give new graph algorithms with much lower round complexities compared to the best known solutions in the MPC model. In particular, in the AMPC model we show how to solve maximal independent set in O(1) rounds, and connectivity/minimum spanning tree in O(log log_{m/n} n) rounds, which is an exponential improvement upon the best known algorithms in the MPC model with sublinear space per machine. Our results imply that the 2-Cycle conjecture, the most popular hardness conjecture in the MPC model, does not hold in the AMPC model.
View details