Hossein Esfandiari

Hossein Esfandiari

Authored Publications
Sort By
  • Title
  • Title, descending
  • Year
  • Year, descending
    Preview abstract Representative Selection (RS) is the problem of finding a small subset of exemplars from a dataset that is representative of the dataset. In this paper, we study RS for attributed graphs, and focus on finding representative nodes that optimize the accuracy of a model trained on the selected representatives. Theoretically, we establish a new hardness result for RS (in the absence of a graph structure) by proving that a particular, highly practical variant of it (RS for Learning) is hard to approximate in polynomial time within any reasonable factor, which implies a significant potential gap between the optimum solution of widely-used surrogate functions and the actual accuracy of the model. We then study the setting where a (homophilous) graph structure is available, or can be constructed, between the data points. We show that with an appropriate modeling approach, the presence of such a structure can turn a hard RS (for learning) problem into one that can be effectively solved. To this end, we develop RS-GNN, a representation learning-based RS model based on Graph Neural Networks. Empirically, we demonstrate the effectiveness of RS-GNN on problems with predefined graph structures as well as problems with graphs induced from node feature similarities, by showing that RS-GNN achieves significant improvements over established baselines on a suite of eight benchmarks. View details
    Preview abstract We present new mechanisms for label differential privacy, a relaxation of differentially private machine learning that only protects the privacy of the labels in the training set. Our mechanisms cluster the examples in the training set using their (non-private) feature vectors, randomly re-sample each label from examples in the same cluster, and output a training set with noisy labels as well as a modified version of the true loss function. We prove that when the clusters are both large and high-quality, the model that minimizes the modified loss on the noisy training set converges to small excess risk at a rate that is comparable to the rate for non-private learning. We describe both a centralized mechanism in which the entire training set is stored by a trusted curator, and a distributed mechanism where each user stores a single labeled example and replaces her label with the label of a randomly selected user from the same cluster. We also describe a learning problem in which large clusters are necessary to achieve both strong privacy and either good precision or good recall. Our experiments show that randomizing the labels within each cluster significantly improves the privacy vs. accuracy trade-off compared to applying uniform randomized response to the labels, and also compared to learning a model via DP-SGD. View details
    Preview abstract Motivated by data analysis and machine learning applications, we consider the popular high-dimensional Euclidean $k$-median and $k$-means problems. We propose a new primal-dual algorithm, inspired by the classic algorithm of Jain and Vazirani and the recent algorithm of Ahmadian et al.. Our algorithm achieves an approximation ratio of respectively 2.40... and 5.95... for Euclidean $k$-median and $k$-means improving upon the 2.60... of Ahmadian et al. and the 6.12.. of Grandoni et al.. View details
    Preview abstract Metric clustering is a fundamental primitive in machine learning with several applications for mining massive data-sets. An important example of metric clustering is the $k$-center problem. While this problem has been extensively studied in distributed settings, all previous algorithms require $\Omega(k)$ space per machine and $\Omega(n k)$ total work. In this paper, we develop the first highly scalable approximation algorithm for $k$-center clustering requiring $o(k)$ space per machine with $o(n k)$ total work. In particular, our algorithm needs $\widetilde{O}(n^{\eps})$ space per machine and $\tilde{O}(n^{1+\epsilon})$ total work, and computes an $O(\log \log \log n)$-approximation of the problem by selecting $(1+o(1))k$ centers in $O(\log \log n)$ rounds. This is achieved by introducing core-sets of truly sublinear size. View details
    Preview abstract We study fundamental graph problems such as graph connectivity, minimum spanning forest (MSF), and approximate maximum (weight) matching in a distributed setting. In particular, we focus on the Adaptive Massively Parallel Computation (AMPC) model, which is a theoretical model that captures MapReduce-like computation augmented with a distributed hash table. We show the first AMPC algorithms for all of the studied problems that run in a constant number of rounds and use only O(n^ϵ) space per machine, where 0<ϵ<1. Our results improve both upon the previous results in the AMPC model, as well as the best-known results in the MPC model, which is the theoretical model underpinning many popular distributed computation frameworks, such as MapReduce, Hadoop, Beam, Pregel and Giraph. Finally, we provide an empirical comparison of the algorithms in the MPC and AMPC models in a fault-tolerant distriubted computation environment. We empirically evaluate our algorithms on a set of large real-world graphs and show that our AMPC algorithms can achieve improvements in both running time and round-complexity over optimized MPC baselines. View details
    Prophets, Secretaries, and Maximizing the Probability of Choosing the Best
    Brendan Lucier
    Michael Mitzenmacher
    MohammadTaghi Hajiaghayi
    AISTATS(2020) (to appear)
    Preview abstract Suppose a customer is faced with a sequence of fluctuating prices, such as for airfare or a product sold by a large online retailer. Given distributional information about what price they might face each day, how should they choose when to purchase in order to maximize the likelihood of getting the best price in retrospect? This is related to the classical secretary problem, but with values drawn from known distributions. In their pioneering work, Gilbert and Mosteller [J. Amer. Statist. Assoc. 1966] showed that when the values are drawn i.i.d., there is a thresholding algorithm that selects the best value with probability approximately 0.5801. However, the more general problem with non-identical distributions has remained unsolved. In this paper, we provide an algorithm for the case of non-identical distributions that selects the maximum element with probability 1/e, and we show that this is tight. We further show that if the observations arrive in a random order, this barrier of 1/e can be broken using a static threshold algorithm, and we show that our success probability is the best possible for any single-threshold algorithm under random observation order. Moreover, we prove that one can achieve a strictly better success probability using more general multi-threshold algorithms, unlike the non-random-order case. Along the way, we show that the best achievable success probability for the random-order case matches that of the i.i.d. case, which is approximately 0.5801, under a ``no-superstars'' condition that no single distribution is very likely ex ante to generate the maximum value. We also extend our results to the problem of selecting one of the k best values. One of the main tools in our analysis is a suitable ``Poissonization'' of random order distributions, which uses Le Cam's theorem to connect the Poisson binomial distribution with the discrete Poisson distribution. This approach may be of independent interest. View details
    Preview abstract Motivated by the increasing need to preserve privacy in digital devices, we introduce the on-device public-private model of computation. Our motivation comes from social-network based recommender systems in which the users want to receive recommendations based on the information available on their devices, as well as the suggestions of their social contacts, without sharing such information or contacts with the central recommendation system. Our model allows us to solve many algorithmic problems while providing absolute (deterministic) guarantees of the privacy of on-device data and the user's contacts. In fact, we ensure that the private data and private contacts are never revealed to the central system. Our restrictive model of computation presents several interesting algorithmic challenges because any computation based on private information and contacts must be performed on local devices of limited capabilities. Despite these challenges, under realistic assumptions of inter-device communication, we show several efficient algorithms for fundamental data mining and machine learning problems, ranging from k-means clustering to heavy hitters. We complement this analysis with strong impossibility results for efficient private algorithms without allowing inter-device communication. In our experimental evaluation, we show that our private algorithms provide results almost as accurate as those of the non-private ones while speeding up the on-device computations by orders of magnitude. View details
    Preview abstract We introduce the Adaptive Massively Parallel Computation (AMPC) model, which is an extension of the widely popular Massively Parallel Computation (MPC) model. At a high level, the AMPC model strengthens the MPC model by storing all messages sent within a round in a distributed data store. In the following round all machines are provided with random read access to the data store, subject to the same constraints on the total amount of communication as in the MPC model. Our model is inspired by the previous empirical studies of distributed graph algorithms using MapReduce and a distributed hash table service. This extension allows us to give new graph algorithms with much lower round complexities compared to the best known solutions in the MPC model. In particular, in the AMPC model we show how to solve maximal independent set in O(1) rounds, and connectivity/minimum spanning tree in O(log log_{m/n} n) rounds, which is an exponential improvement upon the best known algorithms in the MPC model with sublinear space per machine. Our results imply that the 2-Cycle conjecture, the most popular hardness conjecture in the MPC model, does not hold in the AMPC model. View details
    Online Pandora Boxes and Bandits
    MohammadTaghi Hajiaghayi
    Brendan Lucier
    Michael Mitzenmacher
    (2019)
    Preview abstract We consider online variations of the Pandora’s box problem [Weitzman 79], a standard model for understanding issues related to the cost of acquiring information for decision-making. Our problem generalizes both the classic Pandora’s box problem and the prophet inequality framework. Boxes are presented online, each with a random value and cost drawn jointly from some known distribution. Pandora chooses online whether to open each box given its cost, and then chooses irrevocably whether to keep the revealed prize or pass on it. We aim for approximation algorithms against adversaries that can choose the largest prize over any opened box, and use optimal offline policies to decide which boxes to open (without knowledge of the value inside). We consider variations where Pandora can collect multiple prizes subject to feasibility constraints, such as cardinality, matroid, or knapsack constraints. We also consider variations related to classic multi-armed bandit problems from reinforcement learning. Our results use a reduction-based framework where we separate the issues of the cost of acquiring information from the online decision process of which prizes to keep. Our work shows that in many scenarios, Pandora can achieve a good approximation to the best possible performance. View details