Anton Tsitsulin

    GraphWorld: Fake Graphs Bring Real Insights for GNNs
    Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2022)
    The continuing maturity of the deep learning subfield of graph neural networks (GNNs) has motivated recent studies into the standard datasets used to benchmark GNNs. While important improvements have been made to GNN datasets and experimental design, any one dataset provides only a singular, potentially spurious insight into the performance of any GNN being tested. We show that state-of-the-art GNN task datasets do not cover the distribution of graphs in a much larger real-data graph repository, with respect to several key graph metrics. Motivated by this finding, we introduce GraphWorld, a novel distributed framework and software package for testing GNN models on an arbitrarily large population of synthetic task datasets. GraphWorld allows a user to efficiently generate a world of millions of graph datasets, with fine-grained control over graph generator parameters, and benchmark arbitrary GNN models, with built-in hyperparameter tuning. Using GraphWorld to generate diverse graph worlds corresponding to node classification, graph classification, and link prediction tasks, we provide insight into the sensitivity of 10,000+ GNN models to various parameters of graphs and node features, and show comparisons between models that have not been possible to make in any previous work. We also introduce a novel metric with which to explore each model's performance on the graph world, conditioning on graph metrics and graph generator parameters.
    Personalized PageRank (PPR) is a fundamental tool in unsupervised learning of graph representations such as node ranking, labeling, and graph embedding. However, while data privacy is one of the most important recent concerns, existing PPR algorithms are not designed to protect user privacy. PPR is highly sensitive to the input graph edges: a difference of only one edge may cause a large change in the PPR vector, potentially leaking private user data. In this work, we propose an algorithm which outputs an approximate PPR and has provably bounded sensitivity to input edges. In addition, we prove that our algorithm achieves similar accuracy to non-private algorithms when the input graph has large degrees. Our sensitivity-bounded PPR directly implies private algorithms for several tools of graph learning, such as differentially private (DP) PPR ranking, DP node classification, and DP node embedding. To complement our theoretical analysis, we also empirically verify the practical performance of our algorithms.
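The edge sensitivity described in this abstract can be seen on a toy graph. The sketch below is a plain, non-private power-iteration PPR, not the sensitivity-bounded algorithm from the paper. The function names, the 4-node path graph, and the restart probability `alpha=0.15` are all illustrative assumptions.

```python
def personalized_pagerank(adj, source, alpha=0.15, iters=200):
    """Approximate the PPR vector for `source` by power iteration.

    adj:   dict mapping each node to a list of its neighbours (undirected;
           every node is assumed to have at least one neighbour).
    alpha: restart probability (0.15 is an illustrative choice).
    """
    p = {v: 0.0 for v in adj}
    p[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        nxt[source] += alpha  # restart mass always returns to the source
        for u, nbrs in adj.items():
            share = (1.0 - alpha) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share  # the rest follows a uniform random-walk step
        p = nxt
    return p


def with_edge(adj, u, v):
    """Return a copy of `adj` with the undirected edge (u, v) added."""
    new = {k: list(ns) for k, ns in adj.items()}
    new[u].append(v)
    new[v].append(u)
    return new


def l1_distance(p, q):
    """L1 distance between two PPR vectors over the same node set."""
    return sum(abs(p[v] - q[v]) for v in p)


# Hypothetical toy graph: the path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
before = personalized_pagerank(path, source=0)
after = personalized_pagerank(with_edge(path, 0, 3), source=0)
print(l1_distance(before, after))
```

Adding the single edge (0, 3) moves a substantial share of the probability mass between nodes, which is the kind of change a private algorithm has to bound.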
    Graph Neural Networks (GNNs) have achieved state-of-the-art results on many graph analysis tasks such as node classification and link prediction. However, important unsupervised problems on graphs, such as graph clustering, have proved more resistant to advances in GNNs. In this paper, we study unsupervised training of GNN pooling in terms of their clustering capabilities. We start by drawing a connection between graph clustering and graph pooling: intuitively, a good graph clustering is what one would expect from a GNN pooling layer. Counterintuitively, we show that this is not true for state-of-the-art pooling methods, such as MinCut pooling. To address these deficiencies, we introduce Deep Modularity Networks (DMoN), an unsupervised pooling method inspired by the modularity measure of clustering quality, and show how it tackles recovery of the challenging clustering structure of real-world graphs. In order to clarify the regimes where existing methods fail, we carefully design a set of experiments on synthetic data which show that DMoN is able to jointly leverage the signal from the graph structure and node attributes. Similarly, on real-world data, we show that DMoN produces high-quality clusters which correlate strongly with ground truth labels, achieving state-of-the-art results.
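For readers unfamiliar with the modularity measure mentioned in this abstract, the sketch below computes the classic modularity score of a hard partition on an undirected, unweighted graph. It is not DMoN's training objective, which the abstract does not specify. The function name and the two-triangle example graph are illustrative assumptions.

```python
def modularity(edges, communities):
    """Classic modularity Q of a hard partition.

    edges:       list of (u, v) pairs, each undirected edge listed once.
    communities: dict mapping node -> community label.

    Uses Q = sum_c [ L_c / m - (D_c / (2m))^2 ], where m is the number of
    edges, L_c the number of edges inside community c, and D_c the total
    degree of the nodes in c.
    """
    m = len(edges)
    degree = {}
    intra = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if communities[u] == communities[v]:
            c = communities[u]
            intra[c] = intra.get(c, 0) + 1

    deg_sum = {}
    for node, d in degree.items():
        c = communities[node]
        deg_sum[c] = deg_sum.get(c, 0) + d

    return sum(
        intra.get(c, 0) / m - (deg_sum[c] / (2 * m)) ** 2 for c in deg_sum
    )


# Hypothetical example: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
alternating = {0: "a", 1: "b", 2: "a", 3: "b", 4: "a", 5: "b"}

q_good = modularity(edges, by_triangle)
q_bad = modularity(edges, alternating)
print(q_good, q_bad)
```

Grouping nodes by triangle scores higher than the alternating assignment, which matches the intuition that modularity rewards partitions with many internal edges relative to what node degrees alone would predict.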