HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm

Stefan Heule; Marc Nunkesser; Alex Hall

Proceedings of the EDBT 2013 Conference, ACM, Genoa, Italy (to appear)

Abstract

Cardinality estimation has a wide range of applications and
is of particular importance in database systems. Various
algorithms have been proposed in the past, and the
HyperLogLog algorithm is one of them. In this paper, we
present a series of improvements to this algorithm that
reduce its memory requirements and significantly increase its
accuracy for an important range of cardinalities. We have
implemented our proposed algorithm for a system at Google
and evaluated it empirically, comparing it to the original
HyperLogLog algorithm. Like HyperLogLog, our improved
algorithm parallelizes perfectly and computes the
cardinality estimate in a single pass.
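
The single-pass and parallelization properties mentioned in the abstract can be illustrated with a minimal sketch of the original HyperLogLog estimator. This is not the improved algorithm the paper proposes, and it is not taken from the paper. The class name HyperLogLogSketch, the precision parameter p, and the use of SHA-1 as a source of 64 hash bits are assumptions made purely for illustration.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Illustrative basic HyperLogLog (hypothetical class, not the paper's variant)."""

    def __init__(self, p=12):
        # The alpha approximation used in estimate() assumes m >= 128, i.e. p >= 7.
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16")
        self.p = p
        self.m = 1 << p                  # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Assumption: take the first 64 bits of SHA-1 as the hash value.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                 # first p bits pick a register
        rest = h & ((1 << width) - 1)    # remaining bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # Sketches built independently combine by a register-wise maximum,
        # which is why the algorithm parallelizes.
        if self.p != other.p:
            raise ValueError("cannot merge sketches with different precision")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


if __name__ == "__main__":
    left, right = HyperLogLogSketch(), HyperLogLogSketch()
    for i in range(50000):
        left.add(i)                      # each element is seen exactly once
    for i in range(25000, 75000):
        right.add(i)
    left.merge(right)                    # combine partial results
    print(round(left.estimate()))
```

Because merging is a register-wise maximum, the merged sketch is identical to one built in a single pass over the union of both inputs, so partial sketches can be computed on separate machines and combined in any order.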

Research Areas

Algorithms and theory
