Pasin Manurangsi
Authored Publications
Sort By
Improved Inapproximability of VC Dimension and Littlestone’s Dimension via (Unbalanced) Biclique
ITCS 2023 (to appear)
Preview abstract
We study the complexity of computing (and approximating) VC Dimension and Littlestone's Dimension when we are given the concept class explicitly. We give a simple reduction from Maximum (Unbalanced) Biclique problem to approximating VC Dimension and Littlestone's Dimension. With this connection, we derive a range of hardness of approximation results and running time lower bounds. For example, under the (randomized) Gap-Exponential Time Hypothesis or the Strongish Planted Clique Hypothesis, we show a tight inapproximability result: both dimensions are hard to approximate to within a factor of o(log n) in polynomial-time. These improve upon constant-factor inapproximability results from [Manurangsi and Rubinstein, COLT 2017].
View details
Differentially Private All-Pairs Shortest Path Distances: Improved Algorithms and Lower Bounds
Jelani Osei Nelson
Justin Y. Chen
Shyam Narayanan
Yinzhan Xu
SODA 2023 (to appear)
Preview abstract
We study the problem of releasing the weights of all-pairs shortest paths in a weighted undirected graph with differential privacy (DP). In this setting, the underlying graph is fixed and two graphs are neighbors if their edge weights differ by at most 1 in the ℓ1-distance. We give an algorithm with additive error ̃O(n^2/3/ε) in the ε-DP case and an algorithm with additive error ̃O(√n/ε) in the (ε, δ)-DP case, where n denotes the number of vertices. This positively answers a question of Sealfon [Sea16, Sea20], who asked whether a o(n) error algorithm exists. We also show that an additive error of Ω(n1/6) is necessary for any sufficiently small ε, δ > 0.
Furthermore, we show that if the graph is promised to have reasonably bounded weights, one can improve the error further to roughly n^{(√17−3)/2+o(1)}/ε in the ε-DP case and roughly n^{√2−1+o(1)}/ε in the (ε, δ)-DP case. Previously, it was only known how to obtain ̃O(n2/3/ε1/3) additive error in the ε-DP case and ̃O(√n/ε) additive error in the (ε, δ)-DP case for bounded-weight graphs [Sea16].
Finally, we consider a relaxation where a multiplicative approximation is allowed. We show that, with a multiplicative approximation factor k, the additive error can be reduced to ̃O(n^{1/2+O(1/k)}/ε) in the ε-DP case and ̃O(n^{1/3+O(1/k)}/ε) in the (ε, δ)-DP case.
View details
Preview abstract
We study the fine-grained complexity of the famous $k$-center problem in the metric induced by a graph with $n$ vertices and $m$ edges. The problem is NP-hard to approximate within a factor strictly better than $2$, and several $2$-approximation algorithms are known.
Two of the most well-known approaches for the $2$-approximation are (1) finding a maximal distance $r$-independent set (where the minimum pairwise distance is greater than $r$) and (2) Gonzalez's algorithm that iteratively adds the center farthest from the currently chosen centers.
For the approach based on distance-$r$ independent sets, Thorup [SIAM J. Comput. '05] already gave a nearly linear time algorithm. While Thorup's algorithm is not complicated, it still requires tools such as an approximate oracle for neighborhood size by Cohen [J. Comput. Syst. Sci. '97]. Our main result is a nearly straightforward algorithm that improves the running time by an $O(\log n$) factor. It results in an $(2+\eps)$-approximation for $k$-center in $O((m + n \log n)\log n \log(n/\eps))$ time.
For Gonzalez's algorithm [Theor. Comput. Sci. 85], we show that the simple $\widetilde{O}(mk)$-time implementation is nearly optimal if we insist the {\em exact} implementation. On the other hand, we show that an $(1+\eps)$-approximate version of the algorithm is efficiently implementable, leading to an $(2+\eps)$-approximation algorithm in running time $O((m + n \log n)\log^2 n / \eps)$. We also show that, unlike in the distance $r$-independent set-based algorithm, the dependency of $1/\eps$ in the running time is essentially optimal for $(1 + \eps)$-approximate Gonzalez's.
View details
Preview abstract
Differential privacy is often applied with a privacy parameter that is larger than the theory suggests is ideal; various informal justifications for tolerating large privacy parameters have been proposed.
In this work, we consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this framework, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller that the best possible privacy parameter for the entire record of a person (i.e., all the attributes).
View details
Preview abstract
Fairness and privacy are two important concerns in social decision-making processes such as resource allocation. We study privacy in the fair allocation of indivisible resources using the well-established framework of differential privacy. We present algorithms for approximate envy-freeness and proportionality when two instances are considered to be adjacent if they differ only on the utility of a single agent for a single item. On the other hand, we provide strong negative results for both fairness criteria when the adjacency notion allows the entire utility function of a single agent to change.
View details
Leveraging Bias-Variance Trade-offs for Regression with Label Differential Privacy
Ashwinkumar Badanidiyuru Varadaraja
Avinash Varadarajan
Chiyuan Zhang
Ethan Leeman
Pritish Kamath
NeurIPS 2023 (2023)
Preview abstract
We propose a new family of label randomization mechanisms for the task of training regression models under the constraint of label differential privacy (DP). In particular, we leverage the trade-offs between bias and variance to construct better noising mechanisms depending on a privately estimated prior distribution over the labels. We demonstrate that these mechanisms achieve state-of-the-art privacy-accuracy trade-offs on several datasets, highlighting the importance of bias-reducing constraints when training neural networks with label DP. We also provide theoretical results shedding light on the structural properties of the optimal bias-reduced mechanisms.
View details
Preview abstract
We consider the task of producing heatmaps from users' aggregated data while protecting their privacy. We give a differentially private algorithm for this task and demonstrate its advantages over previous algorithms on several real-world datasets.
Our core algorithmic primitive is a differentially private procedure that takes in a set of distributions and produces an output that is close in Earth Mover's Distance (EMD) to the average of the inputs. We prove theoretical bounds on the error of our algorithm under certain sparsity assumption and that these are essentially optimal.
View details
Preview abstract
In this work, we study the task of estimating the numbers of distinct and k-occurring items in a time window under the constraint of differential privacy (DP). We consider several variants depending on whether the queries are on general time windows (between times t1 and t2), or are restricted to being cumulative (between times 1 and t2), and depending on whether the DP neighboring relation is event-level or the more stringent item-level. We obtain nearly tight upper and lower bounds on the errors of DP algorithms for these problems. En route, we obtain an event-level DP algorithm for estimating, at each time step, the number of distinct items seen over the last W updates with error polylogarithmic in W; this answers an open question of Bolot et al. (ICDT 2013).
View details
Preview abstract
We introduce a new algorithm for numerical composition of privacy random variables, useful for computing the accurate privacy parameters for compositions of mechanisms.
For the task of self-composing a broad class of mechanisms $K$ times, this algorithm achieves a running time \& memory usage of $\polylog(K)$ (e.g., this class includes the sub-sampled Gaussian mechanism, that appears in the analysis of DP-SGD).
By comparison, recent work by Gopi et al. (NeurIPS 2021) has a running time of $\wtilde{O}(\sqrt{K})$ for the same task.
Our approach extends to the case of composing $K$ different mechanisms in the same class, improving upon the running time / memory usage in the work of Gopi et al. from $\wtilde{O}(K^{1.5})$ to $\wtilde{O}(K)$.
View details
Preview abstract
In this paper, we consider the problem of differentially private (DP) algorithms for isotonic regression. For the most general problem of isotonic regression over a partially ordered set (poset) X and for any Lipschitz loss function, we obtain a pure-DP algorithm that, given n input points, has an expected excess empirical risk of roughly width(X)⋅log|X|/n, where width(X) is the width of the poset. In contrast, we also obtain a near-matching lower bound of roughly (width(X)+log|X|)/n, that holds even for approximate-DP algorithms. Moreover, we show that the above bounds are essentially the best that can be obtained without utilizing any further structure of the poset.
In the special case of a totally ordered set and for ℓ1 and ℓ2^2 losses, our algorithm can be implemented in near-linear running time; we also provide extensions of this algorithm to the problem of private isotonic regression with additional structural constraints on the output function.
View details