Jump to Content
Morteza Zadimoghaddam

Morteza Zadimoghaddam

I am a staff research scientist (L6) at Google Cambridge office in the US. Prior to moving to Cambridge, I spent two years (2018 and 2019) in the Zurich office and 4 years in the New York office where I started my career at Google in January 2014. Prior to Google, I did my PhD in computer science at MIT (CSAIL) under supervision of Professor Erik D. Demaine. I work on applying optimization techniques to various practical problems in order to find provably efficient algorithms. In particular, I apply infrastructure optimization methods to save computational resources at scale. On the mathematical and research side, I am interested in Submodular Optimization and its applications in large scale data mining and machine learning problems.
Authored Publications
Google Publications
Other Publications
Sort By
  • Title
  • Title, desc
  • Year
  • Year, desc
    Edge-Weighted Online Bipartite Matching
    Runzhou Tao
    Zhiyi Huang
    Journal of the ACM, vol. 69 (2022), 45:1-45:35
    Preview abstract Online bipartite matching is one of the most fundamental problems in the online algorithms literature. Karp, Vazirani, and Vazirani (STOC 1990) introduced an elegant algorithm for the unweighted problem that achieves an optimal competitive ratio of 1 - 1/e. Aggarwal et al. (SODA 2011) later generalized their algorithm and analysis to the vertex-weighted case. Little is known, however, about the most general edge-weighted problem aside from the trivial 1/2-competitive greedy algorithm. In this paper, we present the first online algorithm that breaks the long standing 1/2 barrier and achieves a competitive ratio of at least 0.5086. In light of the hardness result of Kapralov, Post, and Vondrák (SODA 2013) that restricts beating a 1/2 competitive ratio for the more general problem of monotone submodular welfare maximization, our result can be seen as strong evidence that edge-weighted bipartite matching is strictly easier than submodular welfare maximization in the online setting. The main ingredient in our online matching algorithm is a novel subroutine called online correlated selection (OCS), which takes a sequence of pairs of vertices as input and selects one vertex from each pair. Instead of using a fresh random bit to choose a vertex from each pair, the OCS negatively correlates decisions across different pairs and provides a quantitative measure on the level of correlation. We believe our OCS technique is of independent interest and will find further applications in other online optimization problems. View details
    Preview abstract Maximizing a monotone submodular function is a fundamental task in machine learning. In this paper we study the deletion robust version of the problem under the classic matroids constraint. Here the goal is to extract a small size summary of the dataset that contains a high value independent set even after an adversary deleted some elements. We present constant-factor approximation algorithms, whose space complexity depends on the rank $k$ of the matroid and the number $d$ of deleted elements. In the centralized setting we present a $(3.582+O(\eps))$-approximation algorithm with summary size $O(k + \frac{d \log k}{\eps^2})$. In the streaming setting we provide a $(5.582+O(\eps))$-approximation algorithm with summary size and memory $O(k + \frac{d \log k}{\eps^2})$. We complement our theoretical results with an in-depth experimental analysis showing the effectiveness of our algorithms on real-world datasets. View details
    Distributed load balancing: a new framework and improved guarantees
    Allen Liu
    Binghui Peng
    Innovations in Theoretical Computer Science (2021)
    Preview abstract Inspired by applications on search engines and web servers, we consider a load balancing problem with a general \textit{convex} objective function. In this problem, we are given a bipartite graph on a set of sources $S$ and a set of workers $W$ and the goal is to distribute the load from each source among its neighboring workers such that the total load of workers are as balanced as possible. We present a new distributed algorithm that works with \textit{any} symmetric non-decreasing convex function for evaluating the balancedness of the workers' load. Our algorithm computes a nearly optimal allocation of loads in $O(\log n \log^2 d/\eps^3)$ rounds where $n$ is the number of nodes, $d$ is the maximum degree, and $\eps$ is the desired precision. If the objective is to minimize the maximum load, we modify the algorithm to obtain a nearly optimal solution in $O(\log n \log d/\eps^2)$ rounds. This improves a line of algorithms that require a polynomial number of rounds in $n$ and $d$ and appear to encounter a fundamental barrier that prevents them from obtaining poly-logarithmic runtime \cite{berenbrink2005dynamic, berenbrink2009new, subramanian1994analysis, rabani1998local}. In our paper, we introduce a novel primal-dual approach with multiplicative weight updates that allows us to circumvent this barrier. Our algorithm is inspired by \cite{agrawal2018proportional} and other distributed algorithms for optimizing linear objectives but introduces several new twists to deal with general convex objectives. View details
    Preview abstract The sliding window model of computation captures scenarios in which data is arriving continuously, but only the latest $w$ elements should be used for analysis. The goal is to design algorithms that update the solution efficiently with each arrival rather than recomputing it from scratch. In this work, we focus on $k$-clustering problems such as $k$-means and $k$-median. In this setting, we give simple and practical algorithms that come with stronger performance guarantees than previously known results. Empirically, we show that our methods store only a small fraction of the data, are orders of magnitude faster, and find solutions with cost only slightly worse than those returned by algorithms that have access to the full dataset. View details
    Preview abstract In this paper, we provide efficient approximation algorithms for finding the most likelihood configuration (MAP) of size $k$ for Determinantal Point Processes (DPP) in the online and streaming settings where the data points arrive in an arbitrary order. More specifically, in the online setting, where the algorithm cannot discard the selected elements from its local memory, our Online-DPP algorithm achieves an $O(k^{O(k)})$ multiplicative approximation with $\eta$ additive error, using memory independent of the number of points. In the streaming setting, where the algorithm is allowed to discard the previously selected elements, our Stream-DPP algorithm achieves an $O(k^{O(k)})$ multiplicative approximation (and no additive error), with similar memory bounds. We note that the exponential dependence on $k$ in the approximation factor is unavoidable even in the offline setting. View details
    Fully Dynamic Algorithm for Constrained Submodular Optimization
    Ashkan Norouzi Fard
    Jakub Tarnawski
    Slobodan Mitrović
    NeurIPS 2020 (to appear)
    Preview abstract The task of maximizing a monotone submodular function under a cardinality constraint is at the core of many machine learning and data mining applications, including data summarization, sparse regression and coverage problems. We study this problem in the context of fully dynamic streams, where elements can be both inserted and removed. Our main result is a randomized algorithm that maintains an efficient data structure with a poly-logarithmic ammortized update time and returns a (1/2 - \epsilon)-approximate solution. We complement our theoretical analysis with an empirical study of the performance of our algorithm. View details
    Preview abstract Submodular optimization generalizes many classic problems in combinatorial optimization and has recently found a wide range of applications in machine learning (e.g., feature engineering and active learning). For many large-scale optimization problems, we are often concerned with the adaptivity complexity of an algorithm, which quantifies the number of sequential rounds where polynomially-many independent function evaluations can be executed in parallel. While low adaptivity is ideal, it is not sufficient for a distributed algorithm to be efficient, since in many practical applications of submodular optimization the number of function evaluations becomes prohibitively expensive. Motivated by these applications, we study the adaptivity and query complexity of adaptive submodular optimization. Our main result is a distributed algorithm for maximizing a monotone submodular function with cardinality constraint k that achieves a (1 − 1/e − ε)-approximation in expectation. This algorithm runs in O(log(n)) adaptive rounds and makes O(n) calls to the function evaluation oracle in expectation. The approximation guarantee and query complexity are optimal, and the adaptivity is nearly optimal. Moreover, the number of queries is substantially less than in previous works. We also extend our results to the submodular cover problem to demonstrate the generality of our algorithm and techniques. View details
    Preview abstract Maximizing diversity is a central problem in information re- trieval and data mining systems with prominent applications in web search, recommender systems, news aggregators and product search. In this paper, we study a diversity maxi- mization problem (a.k.a. maximum dispersion problem) in which given a set of n objects in a metric space, one wants to find a subset of k objects with the maximum sum of pairwise distances. To solve this problem in a scalable distributed manner, we apply a novel distributed framework for tackling large-scale problems known as randomized composable core-sets: divide the big data set into smaller parts, solve the problem for each part, combine the solutions from each part, and solve the problem on the union of these solutions. Our algorithms improve significantly over the approximation guarantees of state-of-the-art core-set-based algorithms while using min- imum possible intermediate output size. In particular, we present a simple distributed algorithm that achieves an al- most optimal communication complexity, and moreover, it asymptotically achieves approximation factor of 1/2 which is the best possible approximation factor for the global opti- mization problem under certain complexity theory assump- tions. Our algorithms are scalable and practical as shown by our extensive empirical evaluation with large datasets and they can be easily adapted to the major distributed comput- ing systems like MapReduce. Furthermore, we show empir- ically that, in real-life instances, our algorithms reach close- to-optimal solutions with approximation factor of > 90%. This approximation factor is far exceeding the approxima- tion barrier for the problem and provide useful output. View details
    Preview abstract Submodular maximization is a general optimization problem with a wide range of applications in machine learning (e.g., active learning, clustering, and feature selection). In large-scale optimization, the parallel running time of an algorithm is governed by its adaptivity, which measures the number of sequential rounds needed if the algorithm can execute polynomially-many independent oracle queries in parallel. While low adaptivity is ideal, it is not sufficient for an algorithm to be efficient in practice—there are many applications of distributed submodular optimization where the number of function evaluations becomes prohibitively expensive. Motivated by these applications, we study the adaptivity and query complexity of submodular maximization. In this paper, we give the first constant-factor approximation algorithm for maximizing a nonmonotone submodular function subject to a cardinality constraint k that runs in O(log(n)) adaptive rounds and makes O(n log(k)) oracle queries in expectation. In our empirical study, we use three real-world applications to compare our algorithm with several benchmarks for non-monotone submodular maximization. The results demonstrate that our algorithm finds competitive solutions using significantly fewer rounds and queries. View details
    Preview abstract We propose online algorithms for Column Subset Selection (CSS) and Principal Component Analysis (PCA), two methods that are widely employed for data analysis, summarization, and visualization. Given a data matrix A that is revealed one column at a time, the online CSS problems asks to keep a small set of columns, S, that best approximates the space spanned by the columns of A. As each column arrives, the algorithm must irrevocably decide whether to add it to S, or to ignore it. In the online PCA problem, the goal is to output a projection of each column to a low dimensional subspace. In other words, the algorithm must provide an embedding for each column as it arrives, which cannot be changed as new columns arrive. While both of these problems have been studied in the online setting, only additive approximations were known prior to our work. The core of our approach is an adaptive sampling technique that gives a practical and efficient algorithm for both of these problems. We prove that by sampling columns using their “residual norm” (i.e. their norm orthogonal to directions sampled so far), we end up with a significantly better dependence between the number of columns sampled, and the desired error in the approximation. We further show how to combine our algorithm “in series” with prior algorithms. In particular, using the results of Boutsidis et al. [4] and Frieze et al. [14] that have additive guarantees, we show how to improve the bounds on the error of our algorithm. View details
    Preview abstract This paper considers a traditional problem of resource allocation, scheduling jobs on machines. One such recent application is cloud computing, where jobs arrive in an online fashion with capacity requirements and need to be immediately scheduled on physical machines in data centers. It is often observed that the requested capacities are not fully utilized, hence offering an opportunity to employ an overcommitment policy, i.e., selling resources beyond capacity. Setting the right overcommitment level can induce a significant cost reduction for the cloud provider, while only inducing a very low risk of violating capacity constraints. We introduce and study a model that quantifies the value of overcommitment by modeling the problem as a bin packing with chance constraints. We then propose an alternative formulation that transforms each chance constraint into a submodular function. We show that our model captures the risk pooling effect and can guide scheduling and overcommitment decisions. We also develop a family of online algorithms that are intuitive, easy to implement and provide a constant factor guarantee from optimal. Finally, we calibrate our model using realistic workload data, and test our approach in a practical setting. Our analysis and experiments illustrate the benefit of overcommitment in cloud services, and suggest a cost reduction of 1.5% to 17% depending on the provider's risk tolerance. View details
    Better Sliding Window Algorithms to Maximize Subadditive and Diversity Objectives
    Michele Borassi
    Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS 2019)
    Preview abstract The streaming computation model is a standard model for large-scale data analysis: the input arrives one element at a time, and the goal is to maintain an approximately optimal solution using only a constant, or, at worst, poly-logarithmic space. In practice, however, recency plays a large role, and one often wishes to consider only the last w elements that have arrived, the so-called sliding window problem. A trivial approach is to simply store the last w elements in a buffer; our goal is to develop algorithms with space and update time sublinear in w. In this regime, there are two frameworks: exponential histograms and smooth histograms, which can be used to obtain sliding window algorithms for families of functions satisfying certain properties. Unfortunately, these frameworks have limitations and cannot always be applied directly. A prominent example is the case of a submodular function with cardinality constraints. Some of these difficulties can be rectified, but often only on a case-by-case basis. Here, we describe an alternative approach to designing efficient sliding window algorithms for maximization problems. Then we instantiate this approach on a wide range of problems, yielding better algorithms for submodular function optimization, diversity optimization and general subadditive optimization. In doing so, we improve state-of-the art results obtained using problem-specific algorithms. View details
    Preview abstract Any streaming algorithm is commonly judged by the quality of its solution, the memory footprint, and its required amount of computations. In this paper, we study the problem of maximizing a monotone submodular function subject to a cardinality constraint k. We propose SIEVESTREAMING++, a streaming algorithm that with one pass over the data, while keeping only O(k) elements, achieves the tight 1/2 approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal approximation ratio 1/4 with Θ(k) memory or the tight approximation ratio 1/2 with O(k log k) memory. We then show that by buffering a small fraction of the data stream, and through a careful filtering procedure, one can heavily reduce the rounds of adaptive computations, thus can lower the computational complexity of SIEVE-STREAMING++ substantially. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight 1/2 approximation guarantee with a O(k) shared memory, while minimizing not only the rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world applications including data summarization for multisource streams of tweets and of YouTube videos View details
    Consistent Hashing with Bounded Loads
    Mikkel Thorup
    Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (2018), pp. 587-604
    Preview abstract Designing algorithms for balanced allocation of clients to servers in dynamic settings is a challenging problem for a variety of reasons. Both servers and clients may be added and/or removed from the system periodically, and the main objectives of allocation algorithms are: the uniformity of the allocation, and the number of moves after adding or removing a server or a client. The most popular solution for our dynamic settings is Consistent Hashing. However, the load balancing of consistent hashing is no better than a random assignment of clients to servers, so with n of each, we expect many servers to be overloaded with Θ(logn/loglogn) clients. In this paper, with n clients and n servers, we get a guaranteed max-load of 2 while only moving an expected constant number of clients for each update. We take an arbitrary user specified balancing parameter c=1+ϵ>1. With m balls and n bins in the system, we want no load above ⌈cm/n⌉. Meanwhile we want to bound the expected number of balls that have to be moved when a ball or server is added or removed. Compared with general lower bounds without capacity constraints, we show that in our algorithm when a ball or bin is inserted or deleted, the expected number of balls that have to be moved is increased only by a multiplicative factor O(1ϵ2) for ϵ≤1 (Theorem 4) and by a factor 1+O(logcc) for ϵ≥1 (Theorem 3). Technically, the latter bound is the most challenging to prove. It implies that we for superconstant c only pay a negligible cost in extra moves. We also get the same bounds for the simpler problem where we instead of a user specified balancing parameter have a fixed bin capacity C for all bins. View details
    Preview abstract Inspired by many application of bipartite matchings in online advertising and machine learning, we study a simple and natural iterative proportional allocation algorithm: Maintain a priority score $\priority_a$ for each node $a\in\advertisers$ on one side of the bipartition, initialized as $\priority_a=1$. Iteratively allocate each node $i\in \impressions$ on the other side to eligible nodes in $\advertisers$ in proportion of their priority scores. After each round, for each node $a\in \advertisers$, decrease or increase the score $\priority_a$ based on whether it is over- or under- allocated. Our first result is that this simple, distributed algorithm converges to a $(1-\epsilon)$-approximate fractional $b$-matching solution in $O({\log n\over \epsilon^2} )$ rounds. We also extend the proportional allocation algorithm and convergence results to the maximum weighted matching problem, and show that the algorithm can be naturally tuned to produce maximum matching with {\em high entropy}. High entropy, in turn, implies additional desirable properties of this matching, e.g., it satisfies certain diversity and fairness (aka anonymity) properties that are desirable in a variety of applications in online advertising and machine learning. View details
    Scalable Deletion-Robust Submodular Maximization: Data Summarization with Privacy and Fairness Constraints
    Amin Karbasi
    Ehsan Kazemi
    Thirty-fifth International Conference on Machine Learning, ICML 2018
    Preview abstract Can we efficiently extract useful information from a large user-generated dataset while protecting the privacy of the users and/or ensuring fairness in representation. We cast this problem as an instance of a deletion-robust submodular maximization where part of the data may be deleted due to privacy concerns or fairness criteria. We propose the first memory-efficient centralized, streaming, and distributed methods with constant-factor approximation guarantees against \textit{any} number of adversarial deletions. We extensively evaluate the performance of our algorithms against prior state-of-the-art on real-world applications, including (i) Uber-pick up locations with location privacy constraints; (ii) feature selection with fairness constraints for income prediction and crime rate prediction; and (iii) robust to deletion summarization of census data, consisting of 2,458,285 feature vectors. View details
    Preview abstract In the submodular welfare maximization (SWM) problem, the input consists of a set of n items, each of which must be allocated to one of m agents. Each agent ell has a valuation function v_ell, where v_ell(S) denotes the welfare obtained by this agent if she receives the set of items S. The functions v_ell are all submodular; as is standard, we assume that they are monotone and v_ell(∅) = 0. The goal is to partition the items into m disjoint subsets S1, S2, . . . , Sm in order to maximize the social welfare, defined as \Sum_ell v_ell(S_ell). A simple greedy algorithm gives a 1/2-approximation to SWM in the offline setting, and this was the best known until Vondr´ak’s recent (1 − 1/e)-approximation algorithm. In this paper, we consider the online version of SWM. Here, items arrive one at a time in an online manner; when an item arrives, the algorithm must make an irrevocable decision about which agent to assign it to before seeing any subsequent items. This problem is motivated by applications to Internet advertising, where user ad impressions must be allocated to advertisers whose value is a submodular function of the set of users/impressions they receive. There are two natural models that differ in the order in which items arrive. In the fully adversarial setting, an adversary can construct an arbitrary/worst-case instance, as well as pick the order in which items arrive in order to minimize the algorithm’s performance. In this setting, the 1/2-competitive greedy algorithm is the best possible. To improve on this, one must weaken the adversary slightly: In the random order model, the adversary can construct a worst-case set of items and valuations but does not control the order in which the items arrive; instead, they are assumed to arrive in a random order. The random order model has been well studied for online SWM and various special cases, but the best known competitive ratio (even for several special cases) is 1/2 + 1/n, which is barely better than the ratio for the adversarial order. Obtaining a competitive ratio of 1/2 + Ω(1) for the random order model has been an important open problem for several years. We solve this open problem by demonstrating that the greedy algorithm has a competitive ratio of at least 0.505 for online SWM in the random order model. This is the first result showing a competitive ratio bounded above 1/2 in the random order model, even for special cases such as the weighted matching or budgeted allocation problem (without the so-called large capacity assumptions). For special cases of submodular functions including weighted matching, weighted coverage functions, and a broader class of “second-order supermodular” functions, we provide a different analysis that gives a competitive ratio of 0.51. We analyze the greedy algorithm using a factor-revealing linear program, bounding how the assignment of one item can decrease potential welfare from assigning future items. In addition to our new competitive ratios for online SWM, we make two further contributions: First, we define the classes of second-order modular, supermodular, and submodular functions, which are likely to be of independent interest in submodular optimization. Second, we obtain an improved competitive ratio via a technique we refer to as gain linearizing, which may be useful in other contexts: Essentially, we linearize the submodular function by dividing the gain of an optimal solution into gain from individual elements, compare the algorithm’s gain when it assigns an element to the optimal solution’s gain from the element, and, crucially, bound the extent to which assigning elements can affect the potential gain of other elements. View details
    Data Summarization at Scale: A Two-Stage Submodular Approach
    Amin Karbasi
    Ehsan Kazemi
    Marko Mitrovic
    Proceedings of the 35th International Conference on Machine Learning, ICML 2018, PMLR
    Preview abstract The sheer scale of modern datasets has resulted in a dire need for summarization techniques that identify representative elements in a dataset, and then use them to construct a drastically smaller subset which still encodes a similar amount of information. Fortunately, the vast majority of data summarization tasks satisfy an intuitive diminishing returns condition known as submodularity, which allows us to find nearly-optimal solutions in linear time. We focus on a two-stage submodular framework where the goal is to use some given training functions to reduce the ground set so that optimizing new functions (drawn from the same distribution) over the reduced set provides almost as much value as optimizing them over the entire ground set. In this paper, we develop the first streaming and distributed solutions to this problem. In addition to providing strong theoretical guarantees, we demonstrate both the utility and efficiency of our algorithms on real-world tasks including image summarization and ride-share optimization. View details
    Preview abstract Feature selection is a fundamental problem in machine learning and data mining. The majority of feature selection algorithms are designed for running on a single machine (centralized setting) and they are less applicable to very large datasets. Although there are some distributed methods to tackle this problem, most of them are distributing the data horizontally which are not suitable for datasets with a large number of features and few number of instances. Thus, in this paper, we introduce a novel vertically distributable feature selection method in order to speed up this process and be able to handle very large datasets in a scalable manner. In general, feature selection methods aim at selecting relevant and non-redundant features (Minimum Redundancy and Maximum Relevance). It is much harder to consider redundancy in a vertically distributed setting than a centralized setting since there is no global access to the whole data. To the best of our knowledge, this is the first attempt toward solving the feature selection problem with a vertically distributed filter method which handles the redundancy with consistently comparable results with centralized methods. In this paper, we formalize the feature selection problem as a diversity maximization problem by introducing a mutual-information-based metric distance on the features. We show the effectiveness of our method by performing an extensive empirical study. In particular, we show that our distributed method outperforms state-of-the-art centralized feature selection algorithms on a variety of datasets. From a theoretical point of view, we have proved that the used greedy algorithm in our method achieves an approximation factor of 1/4 for the diversity maximization problem in a distributed setting with high probability. Furthermore, we improve this to 8/25 expected approximation using multiplicity in our distribution. View details
    Preview abstract We study the problem of efficiently optimizing submodular functions under cardinality constraints in distributed setting. Recently, several distributed algorithms for this problem have been introduced which either achieve a sub-optimal solution or they run in super-constant number of rounds of computation. Unlike previous work, we aim to design distributed algorithms in multiple rounds with almost optimal approximation guarantees at the cost of outputting a larger number of elements. Toward this goal, we present a distributed algorithm that, for any \epsilon > 0 and any constant r, outputs a set S of O(rk/\epsilon^{1\over r}) items in r rounds, and achieves a (1-\epsilon)-approximation of the value of the optimum set with k items. This is the first distributed algorithm that achieves an approximation factor of (1-\epsilon) running in less than $\log {1\over \epsilon}$ number of rounds. We also prove a hardness result showing that the output of any $1-\epsilon$ approximation distributed algorithm limited to one distributed round should have at least $\Omega(k/\epsilon)$ items. In light of this hardness result, our distributed algorithm in one round, $r=1$, is asymptotically tight in terms of the output size. We support the theoretical guarantees with an extensive empirical study of our algorithm showing that achieving almost optimum solutions is indeed possible in a few rounds for large-scale real datasets. View details
    Overcommitment in Cloud Services – Bin Packing with Chance Constraints
    Maxime Cohen
    Phil Keller
    ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems 2017
    Preview abstract This paper considers the traditional problem of resource allocation, i.e., scheduling jobs on machines. One such recent application is cloud computing, where jobs arrive in an online fashion with capacity requirements and need to be allocated to physical machines. It is often observed that the requested capacities are not fully utilized, hence offering an opportunity to employ an overcommitment policy, i.e., selling resources beyond capacity. Setting the right overcommitment level can induce a significant cost reduction for the cloud provider, while taking a very low risk of violating capacity constraints. We introduce and study a model that quantifies the value of overcommitment by modeling the problem as a bin packing with chance constraints. We then propose an alternative formulation that transforms each chance constraint into a sub-modular function. We show that our model captures the risk pooling effect and allows to guide scheduling and overcommitment decisions. We also develop a family of online algorithms that are intuitive, easy to implement and provide a constant factor guarantee from optimal. Finally, we calibrate our model using realistic workload data and test our approach in a practical setting. Our analysis and experiments illustrate the benefit of overcommitting in cloud services. View details
    Bicriteria Distributed Submodular Maximization in a Few Rounds
    Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures (2017)
    Preview abstract We study the problem of efficiently optimizing submodular functions under cardinality constraints in distributed setting. Recently, several distributed algorithms for this problem have been introduced which either achieve a sub-optimal solution or they run in super-constant number of rounds of computation. Unlike previous work, we aim to design distributed algorithms in multiple rounds with almost optimal approximation guarantees at the cost of outputting a larger number of elements. Toward this goal, we present a distributed algorithm that, for any ε > 0 and any constant r, outputs a set S of O(rk/ε^(1/r)) items in r rounds, and achieves a (1-ε)-approximation of the value of the optimum set with k items. This is the first distributed algorithm that achieves an approximation factor of (1-ε) running in less than log 1/ε number of rounds. We also prove a hardness result showing that the output of any 1-ε approximation distributed algorithm limited to one distributed round should have at least Ω(k/ε) items. In light of this hardness result, our distributed algorithm in one round, r = 1, is asymptotically tight in terms of the output size. We support the theoretical guarantees with an extensive empirical study of our algorithm showing that achieving almost optimum solutions is indeed possible in a few rounds for large-scale real datasets. View details
    Preview abstract Maximizing submodular functions under cardinality constraints lies at the core of numerous data mining and machine learning applications, including data diversification, data summarization, and coverage problems. In this work, we study this question in the context of data streams, where elements arrive one at a time, and we want to design low-memory and fast update-time algorithms that maintain a good solution. Specifically, we focus on the sliding window model, where we are asked to maintain a solution that considers only the last $W$ items. In this context, we provide the first non-trivial algorithm that maintains a provable approximation of the optimum using space sublinear in the size of the window. In particular we give a $\nicefrac{1}{3} - \epsilon$ approximation algorithm that uses space polylogarithmic in the spread of the values of the elements, $\Spread$, and linear in the solution size $k$ for any constant $\epsilon > 0$. At the same time, processing each element only requires a polylogarithmic number of evaluations of the function itself. When a better approximation is desired, we show a different algorithm that, at the cost of using more memory, provides a $\nicefrac{1}{2} - \epsilon$ approximation, and allows a tunable trade-off between average update time and space. This algorithm matches the best known approximation guarantees for submodular optimization in insertion-only streams, a less general formulation of the problem. We demonstrate the efficacy of the algorithms on a number of real world datasets, showing that their practical performance far exceeds the theoretical bounds. The algorithms preserve high quality solutions in streams with millions of items, while storing a negligible fraction of them. View details
    Preview abstract A variety of large-scale machine learning problems can be cast as instances of constrained submodular maximization. Existing approaches for distributed submodular maximization have a critical drawback: The capacity – the number of instances that can fit in memory of each machine – must grow with the data set size. In practice, while one can provision many machines, the capacity of each machine is limited by physical constraints. We propose a truly scalable approach for distributed submodular maximization when the capacity of each machine is fixed. The proposed framework applies to a broad class of algorithms and a variety of constraints. We provide theoretical guarantees on the approximation factor for any available capacity. We empirically evaluate the proposed algorithm on a variety of data sets and demonstrate that the algorithm achieves performance competitive with the centralized greedy solution. View details
    Preview abstract The problem of matrix column subset selection has recently attracted a large body of research, with feature selection serving as one obvious and important application. Among the techniques that have been applied to solve this problem, the greedy algorithm has been shown to be quite effective in practice. However, our theoretical guarantees on its performance have not been ex- plored thoroughly, especially in a distributed set- ting. In this paper, we study the greedy algorithm for the column subset selection problem from a theoretical and empirical perspective and show its effectiveness in a distributed setting. In par- ticular, we provide an improved approximation guarantee for the greedy algorithm, and present the first distributed implementation of this algo- rithm with provable approximation factors. We use the idea of randomized composable core- sets, developed recently in the context of sub- modular maximization. Finally, we validate the effectiveness of this distributed algorithm via an empirical study. View details
    Preview abstract In the Submodular Welfare Maximization (SWM) problem, the input consists of a set of n items, each of which must be allocated to one of m agents. Each agent l has a valuation function vl, where vl(S) denotes the welfare obtained by this agent if she receives the set of items S. The functions vl are all submodular; as is standard, we assume that they are monotone and vl(∅) = 0. The goal is to partition the items into m disjoint subsets S1, S2, ... Sm in order to maximize the social welfare, defined as ∑l = 1m vl(Sl). A simple greedy algorithm gives a 1/2-approximation to SWM in the offline setting, and this was the best known until Vondrak's recent (1-1/e)-approximation algorithm [34]. In this paper, we consider the online version of SWM. Here, items arrive one at a time in an online manner; when an item arrives, the algorithm must make an irrevocable decision about which agent to assign it to before seeing any subsequent items. This problem is motivated by applications to Internet advertising, where user ad impressions must be allocated to advertisers whose value is a submodular function of the set of users / impressions they receive. There are two natural models that differ in the order in which items arrive. In the fully adversarial setting, an adversary can construct an arbitrary / worst-case instance, as well as pick the order in which items arrive in order to minimize the algorithm's performance. In this setting, the 1/2-competitive greedy algorithm is the best possible. To improve on this, one must weaken the adversary slightly: In the random order model, the adversary can construct a worst-case set of items and valuations, but does not control the order in which the items arrive; instead, they are assumed to arrive in a random order. The random order model has been well studied for online SWM and various special cases, but the best known competitive ratio (even for several special cases) is 1/2 + 1/n [9,10], barely better than the ratio for the adversarial order. Obtaining a competitive ratio of 1/2 + Ω(1) for the random order model has been an important open problem for several years. We solve this open problem by demonstrating that the greedy algorithm has a competitive ratio of at least 0.505 for online SWM in the random order model. This is the first result showing a competitive ratio bounded above 1/2 in the random order model, even for special cases such as the weighted matching or budgeted allocation problems (without the so-called 'large capacity' assumptions). For special cases of submodular functions including weighted matching, weighted coverage functions and a broader class of "second-order supermodular" functions, we provide a different analysis that gives a competitive ratio of 0.51. We analyze the greedy algorithm using a factor-revealing linear program, bounding how the assignment of one item can decrease potential welfare from assigning future items. We also formulate a natural conjecture which, if true, would improve the competitive ratio of the greedy algorithm to at least 0.567. In addition to our new competitive ratios for online SWM, we make two further contributions: First, we define the classes of second-order modular, supermodular, and submodular functions, which are likely to be of independent interest in submodular optimization. Second, we obtain an improved competitive ratio via a technique we refer to as gain linearizing, which may be useful in other contexts (see [26]): Essentially, we linearize the submodular function by dividing the gain of an optimal solution into gain from individual elements, compare the gain when it assigns an element to the optimal solution's gain from the element, and, crucially, bound the extent to which assigning elements can affect the potential gain of other elements. View details
    Preview abstract An effective technique for solving optimization problems over massive data sets is to partition the data into smaller pieces, solve the problem on each piece and compute a representative solution from it, and finally obtain a solution inside the union of the representative solutions for all pieces. This technique can be captured via the concept of composable core-sets, and has been recently applied to solve diversity maximization problems as well as several clustering problems [7,15,8]. However, for coverage and submodular maximization problems, impossibility bounds are known for this technique [15]. In this paper, we focus on efficient construction of a randomized variant of composable core-sets where the above idea is applied on a random clustering of the data. We employ this technique for the coverage, monotone and non-monotone submodular maximization problems. Our results significantly improve upon the hardness results for non-randomized core-sets, and imply improved results for submodular maximization in a distributed and streaming settings. The effectiveness of this technique has been confirmed empirically for several machine learning applications [22], and our proof provides a theoretical foundation to this idea. In summary, we show that a simple greedy algorithm results in a 1/3-approximate randomized composable core-set for submodular maximization under a cardinality constraint. Our result also extends to non-monotone submodular functions, and leads to the first 2-round MapReduce-based constant-factor approximation algorithm with O(n) total communication complexity for either monotone or non-monotone functions. Finally, using an improved analysis technique and a new algorithm PseudoGreedy, we present an improved 0.545-approximation algorithm for monotone submodular maximization, which is in turn the first MapReduce-based algorithm beating factor 1/2 in a constant number of rounds. View details
    Submodular secretary problems with extensions
    MohammadTaghi Hajiaghayi
    ACM Transactions on Algorithms, vol. 9 (4) (2013)
    Preview
    Submodular secretary problems with extensions
    MohammadTaghi Hajiaghayi
    ACM Transactions on Algorithms, vol. 9 (4) (2013)
    Preview abstract Online auction is the essence of many modern markets, particularly networked markets, in which information about goods, agents, and outcomes is revealed over a period of time, and the agents must make irrevocable decisions without knowing future information. Optimal stopping theory, especially the classic secretary problem, is a powerful tool for analyzing such online scenarios which generally require optimizing an objective function over the input. The secretary problem and its generalization the multiple-choice secretary problem were under a thorough study in the literature. In this article, we consider a very general setting of the latter problem called the submodular secretary problem, in which the goal is to select k secretaries so as to maximize the expectation of a (not necessarily monotone) submodular function which defines efficiency of the selected secretarial group based on their overlapping skills. We present the first constant-competitive algorithm for this case. In a more general setting in which selected secretaries should form an independent (feasible) set in each of $l$ given matroids as well, we obtain an O(l log^2 r)-competitive algorithm generalizing several previous results, where $r$ is the maximum rank of the matroids. Another generalization is to consider $l$ knapsack constraints (i.e., a knapsack constraint assigns a nonnegative cost to each secretary, and requires that the total cost of all the secretaries employed be no more than a budget value) instead of the matroid constraints, for which we present an O(l)-competitive algorithm. In a sharp contrast, we show for a more general setting of subadditive secretary problem, there is no o~(\sqrt n) competitive algorithm and thus submodular functions are the most general functions to consider for constant-competitiveness in our setting. We complement this result by giving a matching O(\sqrt n) competitive algorithm for the subadditive case. At the end, we consider some special cases of our general setting as well. View details
    Revenue Maximization with Nonexcludable Goods
    Nima Haghpanah
    Internet and Network Economics - 9th International Workshop, WINE 2013, Springer
    Preview
    Preview abstract Inspired by online ad allocation problems, many results have been developed for online matching problems. Most of the previous work deals with a single objective, but, in practice, there is a need to optimize multiple objectives. Here, as an illustrative example motivated by display ads allocation, we study a bi-objective online matching problem. In particular, we consider a set of fixed nodes (ads) with capacity constraints, and a set of online items (pageviews) arriving one by one. Upon arrival of an online item $i$, a set of eligible fixed neighbors (ads) for the item is revealed, together with a weight $w_{ia}$ for eligible neighbor $a$. The problem is to assign each item to an eligible neighbor online, while respecting the capacity constraints; the goal is to maximize both the total weight of the matching and the cardinality. In this paper, we present both approximation algorithms and hardness results for this problem. An $(\alpha, \beta)$-approximation for this problem is a matching with weight at least $\alpha$ fraction of the maximum weighted matching, and cardinality at least $\beta$ fraction of maximum cardinality matching. We present a parametrized approximation algorithm that allows a smooth tradeoff curve between the two objectives: when the capacities of fixed nodes are large, we give a $p(1- 1/e^{1/p}), (1-p)(1-1/e^{1/1-p})$-approximation for any $0 \le p \le 1$, and prove a `hardness curve' combining several inapproximability results. These upper and lower bounds are always close (with a maximum gap of $9\%$), and exactly coincide at the point $(0.43, 0.43)$. For small capacities, we present a smooth parametrized approximation curve for the problem between $(0,1-1/e)$ and $(1/2,0)$ passing through a $(1/3,0.3698)$-approximation. View details
    Preview abstract Motivated by online ad allocation, we study the problem of simultaneous approximations for the adversarial and stochastic online budgeted allocation problem. This problem consists of a bipartite graph G = (X, Y, E), where the nodes of Y along with their corresponding capacities are known beforehand to the algorithm, and the nodes of X arrive online. When a node of X arrives, its incident edges, and their respective weights are revealed, and the algorithm can match it to a neighbor in Y. The objective is to maximize the weight of the final matching, while respecting the capacities. When nodes arrive in an adversarial order, the best competitive ratio is known to be 1 - 1/e, and it can be achieved by the Ranking [18], and its generalizations (Balance [16, 21]). On the other hand, if the nodes arrive through a random permutation, it is possible to achieve a competitive ratio of 1 -- ε [9]. In this paper we design algorithms that achieve a competitive ratio better than 1 -- 1/e on average, while preserving a nearly optimal worst case competitive ratio. Ideally, we want to achieve the best of both worlds, i.e, to design an algorithm with the optimal competitive ratio in both the adversarial and random arrival models. We achieve this for unweighted graphs, but show that it is not possible for weighted graphs. In particular, for unweighted graphs, under some mild assumptions, we show that Balance achieves a competitive ratio of 1 -- ε in a random permutation model. For weighted graphs, however, we prove this is not possible; we prove that no online algorithm that achieves an approximation factor of 1 -- 1/ε for the worst-case inputs may achieve an average approximation factor better than 97.6% for random inputs. In light of this hardness result, we aim to design algorithms with improved approximation ratios in the random arrival model while preserving the competitive ratio of 1 -- 1/ε in the worst case. To this end, we show the algorithm proposed by [21] achieves a competitive ratio of 0.76 for the random arrival model, while having a 1 -- 1/ε ratio in the worst case. View details
    Permutation betting markets: singleton betting with extra information
    Mohammad Ghodsi
    Hamid Mahini
    ACM Conference on Electronic Commerce (2008), pp. 180-189
    Preview
    No Results Found