Ameya Velingker

Authored Publications
    Locality-Aware Graph Rewiring in GNNs
    Federico Barbero
    Amin Saberi
    Michael Bronstein
    Francesco Di Giovanni
    ICLR 2024
    Graph Neural Networks (GNNs) are popular models for machine learning on graphs that typically follow the message-passing paradigm, whereby the feature of a node is updated recursively upon aggregating information over its neighbors. While exchanging messages over the input graph endows GNNs with a strong inductive bias, it can also make them susceptible to over-squashing, thereby preventing them from capturing long-range interactions in the given graph. To rectify this issue, graph rewiring techniques have been proposed as a means of improving information flow by altering the graph connectivity. In this work, we identify three desiderata for graph rewiring: (i) reduce over-squashing, (ii) respect the locality of the graph, and (iii) preserve the sparsity of the graph. We highlight fundamental trade-offs between spatial and spectral rewiring techniques: the former often satisfy (i) and (ii) but not (iii), while the latter generally satisfy (i) and (iii) at the expense of (ii). We propose a novel rewiring framework that satisfies all of (i)-(iii) through a locality-aware sequence of rewiring operations. We then discuss a specific instance of this framework and validate its effectiveness on several real-world benchmarks, showing that it either matches or significantly outperforms existing rewiring approaches.
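    The paper's framework is not reproduced in the abstract, but the locality-aware idea can be illustrated with a toy sketch. The `local_rewire` helper below and its budget parameter are hypothetical names, not the authors' algorithm: it adds shortcut edges only between nearby node pairs, in increasing order of hop distance, with a per-node budget so the rewired graph stays sparse.

    ```python
    from collections import deque

    def hop_distances(adj, src):
        """BFS distances from src in an unweighted graph given as adjacency lists."""
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return dist

    def local_rewire(adj, max_hop, budget_per_node):
        """Toy locality-aware rewiring (illustrative only): for l = 2..max_hop in
        order, connect each node to at most `budget_per_node` nodes at hop
        distance exactly l, so added edges stay local and the graph stays sparse."""
        new_edges = set()
        for u in adj:
            dist = hop_distances(adj, u)
            added = 0
            for l in range(2, max_hop + 1):
                for v, dv in sorted(dist.items()):
                    if dv == l and added < budget_per_node:
                        new_edges.add((min(u, v), max(u, v)))
                        added += 1
        return new_edges

    # Path graph 0-1-2-3-4: rewiring adds a few short-range shortcuts only.
    path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 4] for i in range(5)}
    extra = local_rewire(path, max_hop=2, budget_per_node=1)
    ```

    On the path graph this adds only distance-2 shortcuts, honoring locality (ii) and sparsity (iii) while shortening paths, which is the tension with spectral rewiring that the abstract describes.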
    Many geographic information systems applications rely on data provided by user devices in the road network. Such applications include traffic monitoring, driving navigation, and detecting road closures or the construction of new roads. This signal is collected by sampling locations from user trajectories, a process that is critical for all such systems yet has not been sufficiently studied in the literature. The most natural way to sample a trajectory is perhaps a frequency-based algorithm, e.g., sample every x seconds. However, as we argue in this paper, such a simple strategy can be very wasteful in terms of resources (e.g., server-side processing, user battery) and in terms of the amount of user data it maintains. In this work we conduct a horizontal study of various location sampling algorithms (frequency-based, road-geography-based, reservoir-sampling-based, etc.) and extract their trade-offs in terms of various metrics of interest, such as the size of the stored data and the induced quality of training for prediction tasks (e.g., predicting speeds) using the road network of New York City.
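    As a minimal illustration of the trade-off the abstract describes, here is a sketch of two sampling strategies (the function names and the toy planar distance metric are assumptions, not from the paper): frequency-based sampling keeps many redundant points when a device moves slowly, while distance-based sampling adapts to actual movement.

    ```python
    import math

    def sample_by_frequency(trajectory, period_s):
        """Keep one point every `period_s` seconds (timestamps in seconds)."""
        sampled, next_t = [], -math.inf
        for t, lat, lon in trajectory:
            if t >= next_t:
                sampled.append((t, lat, lon))
                next_t = t + period_s
        return sampled

    def sample_by_distance(trajectory, min_dist):
        """Keep a point only after moving at least `min_dist` since the last
        kept point (toy planar metric in degrees, for illustration)."""
        sampled = []
        for t, lat, lon in trajectory:
            if not sampled or math.hypot(lat - sampled[-1][1],
                                         lon - sampled[-1][2]) >= min_dist:
                sampled.append((t, lat, lon))
        return sampled

    # A slow-moving trajectory reporting every 10 s for 10 minutes:
    # frequency sampling keeps many near-duplicate points, distance
    # sampling keeps far fewer.
    traj = [(t, 40.0 + 0.0001 * t, -74.0) for t in range(0, 600, 10)]
    freq = sample_by_frequency(traj, period_s=30)
    dist = sample_by_distance(traj, min_dist=0.0095)
    ```

    The gap between `len(freq)` and `len(dist)` on this slow trajectory is exactly the kind of wasted storage and battery that the paper's horizontal study quantifies.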
    Graph Neural Networks (GNNs) have emerged as a powerful technique for learning on relational data. Owing to the relatively limited number of message-passing steps they perform (and hence their smaller receptive field), there has been significant interest in improving their expressivity by incorporating structural aspects of the underlying graph. In this paper, we explore the use of affinity measures as features in graph neural networks, in particular measures arising from random walks, including effective resistance and hitting and commute times. We propose message-passing networks based on these features and evaluate their performance on a variety of node and graph property prediction tasks. Our architecture has low computational complexity, and our features are invariant to permutations of the underlying graph. The measures we compute allow the network to exploit the connectivity properties of the graph, enabling us to outperform relevant benchmarks on a wide variety of tasks, often with significantly fewer message-passing steps. On one of the largest publicly available graph regression datasets, OGB-LSC-PCQM4Mv1, we obtain the best known single-model validation MAE at the time of writing.
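    Effective resistance, one of the affinity measures named above, can be computed from the Moore-Penrose pseudoinverse of the graph Laplacian. The standard identity sketched below is textbook material, not code from the paper:

    ```python
    import numpy as np

    def effective_resistance(adj):
        """All-pairs effective resistance of a connected graph, via the
        pseudoinverse L+ of the Laplacian: R[u, v] = L+[u,u] + L+[v,v] - 2 L+[u,v]."""
        deg = np.diag(adj.sum(axis=1))
        L_pinv = np.linalg.pinv(deg - adj)
        d = np.diag(L_pinv)
        return d[:, None] + d[None, :] - 2 * L_pinv

    # Path graph on 3 nodes: resistance between the endpoints is 2
    # (two unit resistors in series), and 1 between adjacent nodes.
    A = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]], dtype=float)
    R = effective_resistance(A)
    ```

    Commute times, another measure mentioned in the abstract, follow directly from this quantity: for a graph with m edges, the commute time between u and v is 2m times their effective resistance.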
    We study efficient approximation algorithms for the binary matrix factorization (BMF) problem, where the inputs are a matrix A ∈ {0,1}^(n×d) and a rank parameter k > 0 (typically a small integer), and the goal is to approximate A by outputting factors U ∈ {0,1}^(n×k) and V ∈ {0,1}^(k×d) that minimize the Frobenius loss ||A − UV||_F. The previous state-of-the-art for this problem is the approximation algorithm of Kumar et al. [ICML 2019], which achieves a C-approximation for some constant C > 576. We give the first (1+ε)-approximation algorithm, running in time 2^(Õ(k²/ε⁴)) · poly(n, d), matching the running time of previous results up to the new dependence on the desired approximation parameter ε > 0. Our techniques generalize to other common variants of the BMF problem, yielding bicriteria (1+ε)-approximation algorithms for L_p loss functions as well as for the setting where all matrix operations are performed over F_2. Moreover, our approach can be implemented in standard big-data models, such as the streaming and distributed models.
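    To make the objective concrete, here is a brute-force sketch of exact BMF on a tiny instance. This exhaustive search is illustrative only; the paper's contribution is the (1+ε)-approximation algorithm, not this exponential enumeration.

    ```python
    import itertools
    import numpy as np

    def bmf_bruteforce(A, k):
        """Exact binary matrix factorization by exhaustive search (tiny inputs
        only): minimize ||A - U V||_F over all U in {0,1}^(n x k) and
        V in {0,1}^(k x d). Runs in time 2^(nk + kd)."""
        n, d = A.shape
        best = (np.inf, None, None)
        for U_flat in itertools.product([0, 1], repeat=n * k):
            U = np.array(U_flat).reshape(n, k)
            for V_flat in itertools.product([0, 1], repeat=k * d):
                V = np.array(V_flat).reshape(k, d)
                loss = np.linalg.norm(A - U @ V)
                if loss < best[0]:
                    best = (loss, U, V)
        return best

    # A has an exact rank-2 binary factorization, so the optimum loss is 0.
    A = np.array([[1, 1, 0],
                  [1, 1, 0],
                  [0, 0, 1]])
    loss, U, V = bmf_bruteforce(A, k=2)
    ```

    Even this toy search makes the combinatorial difficulty visible: the search space is 2^(nk+kd), which is why the 2^(Õ(k²/ε⁴)) · poly(n, d) runtime above, exponential only in the small rank parameter, is the interesting regime.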
    Exphormer: Sparse Transformers for Graphs
    Hamed Shirzad
    Balaji Venkatachalam
    Danica J. Sutherland
    Ali Sinop
    ICML 2023
    Graph transformers have emerged as a promising architecture for a variety of graph learning and representation tasks. Despite their successes, though, it remains challenging to scale graph transformers to large graphs while maintaining accuracy competitive with message-passing networks. In this paper, we introduce Exphormer, a framework for building powerful and scalable graph transformers. Exphormer uses a sparse attention mechanism based on two ingredients: virtual global nodes and expander graphs, whose mathematical characteristics, such as spectral expansion, pseudorandomness, and sparsity, yield graph transformers with complexity only linear in the size of the graph, while allowing us to prove desirable theoretical properties of the resulting transformer models. We show that incorporating Exphormer into the recently proposed GraphGPS framework produces models with competitive empirical results on a wide variety of graph datasets, including state-of-the-art results on three datasets. We also show that Exphormer can scale to larger graphs than previous graph transformer architectures. Code is available at https://github.com/hamed1375/Exphormer.
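    A rough sketch of the kind of sparse attention pattern described above: local graph edges, random permutation edges standing in for a proper expander, and virtual global nodes that attend to and from everything. This is an illustration of the idea, not the released Exphormer code, and the random-permutation construction is an assumption in place of the paper's expander constructions.

    ```python
    import numpy as np

    def sparse_attention_mask(num_nodes, graph_edges, expander_degree,
                              num_virtual, seed=0):
        """Boolean attention mask of shape (n + v, n + v) combining
        (i) input-graph edges, (ii) edges of a random near-regular graph
        standing in for an expander, and (iii) virtual global nodes."""
        rng = np.random.default_rng(seed)
        n_total = num_nodes + num_virtual
        mask = np.zeros((n_total, n_total), dtype=bool)
        np.fill_diagonal(mask, True)                  # self-attention
        for u, v in graph_edges:                      # local message passing
            mask[u, v] = mask[v, u] = True
        for _ in range(expander_degree // 2):         # random permutation edges
            perm = rng.permutation(num_nodes)
            mask[np.arange(num_nodes), perm] = True
            mask[perm, np.arange(num_nodes)] = True
        mask[num_nodes:, :] = True                    # virtual nodes see all
        mask[:, num_nodes:] = True
        return mask

    # Path graph on 100 nodes plus one virtual global node: the number of
    # allowed attention pairs grows linearly in n, unlike dense n^2 attention.
    mask = sparse_attention_mask(100, [(i, i + 1) for i in range(99)],
                                 expander_degree=4, num_virtual=1)
    ```

    The linear edge count is what lets attention restricted to this mask run in time linear in the graph size, which is the scalability claim in the abstract.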
    Linear Space Lower Bounds for Approximating q-ary CSPs in the Streaming Model
    Chi-Ning Chou
    Alexander Golovnev
    Madhu Sudan
    Santhoshini Velusamy
    STOC 2022
    We consider the approximability of constraint satisfaction problems in the streaming setting. For every constraint satisfaction problem (CSP) on n variables taking values in {0, …, q−1}, we prove that improving over the trivial approximability by a factor of q requires Ω(n) space, even on instances with O(n) constraints. We also identify a broad subclass of problems for which any improvement over the trivial approximability requires Ω(n) space. The key technical core is an optimal q^(−(k−1))-inapproximability result for the case where every constraint is given by a system of k−1 linear equations mod q over k variables. Prior to our work, no such hardness was known for an approximation factor less than 1/2 for any CSP. Our work builds on and extends the work of Kapralov and Krachun (Proc. STOC 2019), who showed a linear lower bound on any non-trivial approximation of MAX CUT in graphs; this corresponds roughly to the case of MAX k-LIN mod q with k = q = 2. Each of the extensions presents non-trivial technical challenges that we overcome in this work.
    We give the first polynomial-time and -sample (ε, δ)-differentially private (DP) algorithm to estimate the mean, covariance, and higher moments in the presence of a constant fraction of adversarial outliers. Our algorithm succeeds for families of distributions that satisfy two well-studied properties from prior work on robust estimation: certifiable subgaussianity of directional moments and certifiable hypercontractivity of degree-2 polynomials. Our recovery guarantees hold in the "right" affine-invariant norms: Mahalanobis distance for the mean, multiplicative spectral and relative Frobenius distance guarantees for the covariance, and injective norms for higher moments. Prior works obtained private robust algorithms for mean estimation of subgaussian distributions with bounded covariance. For covariance estimation, ours is the first efficient algorithm (even in the absence of outliers) that succeeds without any condition-number assumptions. Our algorithms arise from a new framework that provides a general blueprint for modifying convex relaxations for robust estimation to satisfy strong worst-case stability guarantees in the appropriate parameter norms whenever the algorithms produce witnesses of correctness during their run. We verify such guarantees for a modification of standard sum-of-squares (SoS) semidefinite programming relaxations for robust estimation. Our privacy guarantees are obtained by combining stability guarantees with a new "estimate-dependent" noise injection mechanism in which noise scales with the eigenvalues of the estimated covariance. We believe this framework will be useful more generally in obtaining DP counterparts of robust estimators. Independently of our work, Ashtiani and Liaw [AL21] also obtained a polynomial-time and -sample private robust estimation algorithm for Gaussian distributions.
    An exciting new development in differential privacy is the shuffled model, in which an anonymous channel enables non-interactive, differentially private protocols with error much smaller than what is possible in the local model, while relying on weaker trust assumptions than in the central model. In this paper, we study basic counting problems in the shuffled model and establish separations between the error achievable in the single-message shuffled model and in the shuffled model with multiple messages per user. For the problem of frequency estimation for n users and a domain of size B, we obtain:
      • A nearly tight lower bound of Ω~(min(n^(1/4), B^(1/2))) on the error in the single-message shuffled model. This implies that the protocols obtained from the amplification-via-shuffling works of Erlingsson et al. (SODA 2019) and Balle et al. (Crypto 2019) are essentially optimal for single-message protocols. A key ingredient in the proof is a lower bound on the error of locally private frequency estimation in the low-privacy (aka high-ε) regime.
      • Protocols in the multi-message shuffled model with poly(log B, log n) bits of communication per user and polylog(B) error, which provide an exponential improvement in error over what is possible with single-message algorithms.
    For the related selection problem on a domain of size B, we prove:
      • A nearly tight lower bound of Ω(B) on the number of users in the single-message shuffled model. This significantly improves on the Ω(B^(1/17)) lower bound obtained by Cheu et al. (Eurocrypt 2019), and, combined with their Õ(B^(1/2))-error multi-message protocol, implies the first separation between single-message and multi-message protocols for this problem.
    Pure Differentially Private Summation from Anonymous Messages
    Noah Golowich
    Rasmus Pagh
    Information Theoretic Cryptography (ITC) (2020), 15:1-15:23
    The shuffled (aka anonymous) model has recently generated significant interest as a candidate distributed privacy framework with trust assumptions better than the central model but with achievable error rates smaller than the local model. In this paper, we study pure differentially private protocols in the shuffled model for summation, a very basic and widely used primitive. Specifically:
      • For the binary summation problem, where each of n users holds a bit as input, we give a pure ε-differentially private protocol for estimating the number of ones held by the users up to an absolute error of O_ε(1), in which each user sends O_ε(log n) messages, each consisting of a single bit. This is the first pure differentially private protocol in the shuffled model with error o(√n) for constant values of ε. Using our binary summation protocol as a building block, we give a pure ε-differentially private protocol that performs summation of real numbers in [0, 1] up to an absolute error of O_ε(1), in which each user sends O_ε(log³ n) messages, each consisting of O(log log n) bits.
      • In contrast, we show that for any pure ε-differentially private protocol for binary summation in the shuffled model having absolute error n^(0.5−Ω(1)), the per-user communication must be at least Ω_ε(√(log n)) bits. This implies (i) the first separation between the (bounded-communication) multi-message shuffled model and the central model, and (ii) the first separation between pure and approximate differentially private protocols in the shuffled model.
    Interestingly, over the course of proving our lower bound, we have to consider (a generalization of) the following question, which might be of independent interest: given γ ∈ (0, 1), what is the smallest positive integer m for which there exist two random variables X0 and X1 supported on {0, …, m} such that (i) the total variation distance between X0 and X1 is at least 1 − γ, and (ii) the moment generating functions of X0 and X1 are within a constant factor of each other everywhere? We determine the answer to this question up to constant factors.
    Oblivious Sketching of High-Degree Polynomial Kernels
    Thomas D. Ahle
    Michael Kapralov
    Jakob B. T. Knudsen
    Rasmus Pagh
    David P. Woodruff
    Amir Zandieh
    SODA (Symposium on Discrete Algorithms) (2020), pp. 141-160
    Kernel methods are fundamental tools in machine learning that allow detection of non-linear dependencies between data without explicitly constructing feature vectors in high-dimensional spaces. A major disadvantage of kernel methods is their poor scalability: primitives such as kernel PCA or kernel ridge regression generally take prohibitively large quadratic space and (at least) quadratic time, as kernel matrices are usually dense. Some methods for speeding up kernel linear algebra are known, but they invariably take time exponential either in the dimension of the input point set (e.g., fast multipole methods suffer from the curse of dimensionality) or in the degree of the kernel function. Oblivious sketching has emerged as a powerful approach to speeding up numerical linear algebra over the past decade, but our understanding of oblivious sketching solutions for kernel matrices has remained quite limited, suffering from the aforementioned exponential dependence on input parameters. Our main contribution is a general method for applying sketching solutions developed in numerical linear algebra over the past decade to a tensoring of data points without forming the tensoring explicitly. This leads to the first oblivious sketch for the polynomial kernel with a target dimension that is only polynomially dependent on the degree of the kernel function, as well as the first oblivious sketch for the Gaussian kernel on bounded datasets that does not suffer from an exponential dependence on the dimensionality of input data points.
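    The classical degree-2 TensorSketch, a standard construction that this line of work builds on and improves, already shows how a sketch of a tensor product can be computed without materializing it: convolve two CountSketches via FFT. This is a minimal sketch with assumed hash arrays, not the paper's construction.

    ```python
    import numpy as np

    def count_sketch(x, h, s, m):
        """CountSketch of x into m buckets: bucket h[i] accumulates s[i] * x[i]."""
        out = np.zeros(m)
        np.add.at(out, h, s * x)
        return out

    def tensor_sketch_deg2(x, h1, s1, h2, s2, m):
        """Classical degree-2 TensorSketch: a CountSketch of the tensor product
        x (x) x, computed implicitly as an FFT-based circular convolution of
        two independent CountSketches, never forming the d^2-dim tensor."""
        f1 = np.fft.fft(count_sketch(x, h1, s1, m))
        f2 = np.fft.fft(count_sketch(x, h2, s2, m))
        return np.real(np.fft.ifft(f1 * f2))

    rng = np.random.default_rng(0)
    d, m = 50, 4096
    h1, h2 = rng.integers(0, m, d), rng.integers(0, m, d)
    s1, s2 = rng.choice([-1.0, 1.0], d), rng.choice([-1.0, 1.0], d)
    x, y = rng.standard_normal(d), rng.standard_normal(d)

    # Inner products of sketches approximate the degree-2 polynomial kernel <x, y>^2.
    approx = tensor_sketch_deg2(x, h1, s1, h2, s2, m) @ tensor_sketch_deg2(y, h1, s1, h2, s2, m)
    exact = (x @ y) ** 2
    ```

    The exponential blow-up the abstract refers to appears when this construction is iterated to degree p; the paper's contribution is a sketch whose target dimension depends only polynomially on the degree.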
    Private Aggregation from Fewer Anonymous Messages
    Rasmus Pagh
    Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT) (2020), pp. 798-827
    Consider the setup where n parties are each given a number x_i in F_q, and the goal is to compute the sum of the x_i securely and with as little communication as possible. We study this problem in the anonymized model of Ishai et al. (FOCS 2006), in which each party may broadcast anonymous messages on an insecure channel. We present a new analysis of the one-round "split and mix" protocol of Ishai et al. To achieve the same security parameter, our analysis reduces the required number of messages by a Θ(log n) multiplicative factor. We complement our positive result with lower bounds showing that the dependence of the number of messages on the domain size, the number of parties, and the security parameter is essentially tight. Using a reduction of Balle et al. (2019), our improved analysis of the protocol of Ishai et al. yields, in the same model, an (ε, δ)-differentially private protocol for aggregation that, for any constant ε > 0 and any δ = 1/poly(n), incurs only a constant error and requires only a constant number of messages per party. Previously, such a protocol was known only with Θ(log n) messages per party.
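    A minimal sketch of the one-round split-and-mix idea (toy parameters, no security analysis, and the function name is an assumption): each party splits its input into uniformly random additive shares mod q, the anonymous channel shuffles all shares from all parties together, and the analyzer can recover the sum, while any individual share is uniform on its own.

    ```python
    import random

    def split_and_mix(inputs, q, num_shares, rng=random):
        """One-round split-and-mix: each party splits its input into `num_shares`
        uniformly random additive shares mod q; the anonymous channel shuffles
        all shares together before delivery."""
        shares = []
        for x in inputs:
            parts = [rng.randrange(q) for _ in range(num_shares - 1)]
            parts.append((x - sum(parts)) % q)  # last share makes them sum to x mod q
            shares.extend(parts)
        rng.shuffle(shares)                     # models the anonymous channel
        return shares

    q = 101
    inputs = [5, 17, 99, 3]
    shuffled = split_and_mix(inputs, q, num_shares=3)
    total = sum(shuffled) % q                   # the analyzer recovers only the sum
    ```

    The quantity the paper's analysis pins down is how large `num_shares` must be for the shuffled multiset to hide everything except the sum; the improved analysis shaves a Θ(log n) factor off the number of messages needed for a given security parameter.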