Graph mining

Our mission is to build the most scalable library for graph algorithms and analysis and apply it to a multitude of Google products.

Our mission is to build the most scalable library for graph algorithms and analysis and apply it to a multitude of Google products.

About the team

We formalize data mining and machine learning challenges as graph problems and perform fundamental research in those fields leading to publications in top venues. Our algorithms and systems are used in a wide array of Google products such as Search, YouTube, AdWords, Play, Maps, and Social.

Team focus summaries

Large-Scale Clustering and Connected Components

Graph Neural Networks and Graph Embeddings

Large-Scale balanced partitioning

Large-Scale link modeling

Large-Scale similarity ranking

Public-private graph computation

Streaming and dynamic graph algorithms

Large-Scale centrality ranking

Large-Scale graph building

Graph-based sampling

ML compiler optimization

Featured publications

Talk like a Graph: Encoding Graphs for Large Language Models

Bahar Fatemi

Bryan Perozzi

Jonathan Halcrow

ICLR (2024)

Measuring Re-identification Risk

CJ Carey

Travis Dick

Alessandro Epasto

Adel Javanmard

Josh Karlin

Shankar Kumar

Andres Munoz Medina

Vahab Mirrokni

Gabriel Henrique Nunes

Sergei Vassilvitskii

Peilin Zhong

SIGMOD (2023)

Near-Optimal Private and Scalable k-Clustering

Vincent Pierre Cohen-addad

Alessandro Epasto

Vahab Mirrokni

Shyam Narayanan

Peilin Zhong

NeurIPS 2022 (2022)

Optimal Distributed Submodular Optimization via Sketching

MohammadHossein Bateni

Hossein Esfandiari

Vahab Mirrokni

Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2018), pp. 1138-1147

Tackling Provably Hard Representative Selection via Graph Neural Networks

Anton Tsitsulin

Bryan Perozzi

Hossein Esfandiari

Mehran Kazemi

Mohammad "Hossein" Bateni

Vahab Mirrokni

Deepak Ramachandran

Transactions on Machine Learning Research (2023)

TeraHAC: Hierarchical Agglomerative Clustering of Trillion-Edge Graphs

Jason Lee

Jakub Łącki

Laxman Dhulipala

Vahab Mirrokni

SIGMOD'24 (2023)

Massively Parallel Computation via Remote Memory Access

Hossein Esfandiari

Jakub Łącki

Laxman Dhulipala

Soheil Behnezhad

Vahab Mirrokni

Warren Schudy

SPAA 2019

Affinity Clustering: Hierarchical Clustering at Scale

MohammadHossein Bateni

Soheil Behnezhad

Mahsa Derakhshan

MohammadTaghi Hajiaghayi

Raimondas Kiveris

Silvio Lattanzi

Vahab Mirrokni

NIPS 2017, pp. 6867-6877

Distributed Balanced Partitioning via Linear Embedding

Kevin Aydin

Mohammadhossein Bateni

Vahab Mirrokni

Ninth ACM International Conference on Web Search and Data Mining (WSDM), ACM (2016), pp. 387-396

Distributed Graph Algorithmics: Theory and Practice

Silvio Lattanzi

Vahab S. Mirrokni

WSDM (2015), pp. 419-420

Grale: Designing Networks for Graph Learning

Jonathan Jesse Halcrow

Alexandru Moșoi

Sam Ruth

Bryan Perozzi

Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Association for Computing Machinery (2020), 2523–2532