Privacy-Preserving Secure Cardinality and Frequency Estimation

Benjamin Kreuter
Raimundo Mirisola
Yao Wang
Google, LLC (2020)

Abstract

In this paper we introduce a new family of methods for cardinality and
frequency estimation. These methods combine aspects of HyperLogLog
(HLL) and Bloom filters in order to build a sketch that, like HLL, is
substantially more compact than a Bloom filter, but like a Bloom filter
maintains the ability to union sketches with a bucket-wise sum. Together
these properties enable the creation of a scalable secure multi-party computation protocol that takes advantage of homomorphic encryption to
combine sketches across multiple untrusted parties. The protocol limits
the amount of information that participants learn to differentially private
estimates of the union of sketches and some partial information about the
Venn diagram of the per-sketch cardinalities.