Position Paper: Graph Learning Loses Relevance Due To Poor Benchmarks

Maya Bechler-Speicher
Ben Finkelshtein
Fabrizio Frasca
Luis Müller
Jan Tönshoff
Antoine Siraudin
Viktor Zaverkin
Michael Bronstein
Mathias Niepert
Michael Galkin
Christopher Morris
2025

Abstract

While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current benchmarking practices often lack focus on transformative, real-world applications, favoring narrow domains like two-dimensional molecular graphs over broader, impactful areas such as combinatorial optimization, relational databases, or chip design. Additionally, many benchmark datasets poorly represent the underlying data, leading to inadequate abstractions and misaligned use cases. Fragmented evaluations and an excessive focus on accuracy exacerbate these issues, incentivizing overfitting rather than fostering generalizable insights. These limitations have prevented the development of truly useful graph foundation models. This position paper calls for a paradigm shift toward more meaningful benchmarks, rigorous evaluation protocols, and stronger collaboration with domain experts to drive impactful and reliable advances in graph learning research.