Craig Wright
Authored Publications
Privacy-centric Cross-publisher Reach and Frequency Estimation Via Vector of Counts
Jason Frye
Jiayu Peng
Jim Koehler
Joseph Goodknight Knightbrook
Laura Book
Michael Daub
Scott Schneider
Sheng Ma
Xichen Huang
Ying Liu
Yunwen Yang
Preston Lee
Google Inc. (2021)
Preview abstract
Reach and frequency are two of the most important metrics in advertising management. Ads are distributed across different publishers in the hope of maximizing reach at an effective frequency. Reliable cross-publisher reach and frequency measurement is needed to assess the actual ROI of branding and to improve budget allocation strategy. However, cross-publisher measurement is non-trivial under strict privacy restrictions.
This paper introduces the first locally differentially private solution in the literature for cross-publisher reach and frequency estimation. The solution consists of a family of algorithms based on a data structure called Vector of Counts (VoC). Complying with the standard of differential privacy, the solution prevents attackers from telling with sufficient confidence whether any given user was reached. The solution is particularly accurate when estimating across two publishers. For more than two publishers, it makes a careful bias-variance trade-off: it achieves small variance at the risk of bias when user activity is correlated across publishers.
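As a rough illustration of the idea, the sketch below builds a vector of counts by hashing each user ID into one of k buckets, adds noise locally, and estimates two-publisher overlap from the inner product of the two vectors. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the SHA-256 bucketing, the Laplace noise, and the inner-product estimator (which relies on two distinct IDs colliding in a bucket with probability 1/k).

```python
import hashlib
import random


def bucket_of(user_id: str, num_buckets: int) -> int:
    """Map a user ID to a bucket with a hash shared by all publishers (assumed)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def vector_of_counts(user_ids, num_buckets: int) -> list:
    """Count the distinct reached users that fall into each bucket."""
    counts = [0] * num_buckets
    for uid in set(user_ids):
        counts[bucket_of(uid, num_buckets)] += 1
    return counts


def add_laplace_noise(counts, epsilon: float, rng: random.Random) -> list:
    """Add Laplace(0, 1/epsilon) noise to every bucket (illustrative mechanism).

    The difference of two independent Exp(rate=epsilon) draws is Laplace
    distributed with scale 1/epsilon. One user changes one bucket by 1.
    """
    return [c + rng.expovariate(epsilon) - rng.expovariate(epsilon) for c in counts]


def estimate_overlap(voc_a, voc_b) -> float:
    """Estimate |A ∩ B| from two vectors of counts with the same bucket count.

    E[<v_A, v_B>] = |A ∩ B| * (1 - 1/k) + |A| * |B| / k under uniform hashing,
    so the expression is solved for |A ∩ B|.
    """
    k = len(voc_a)
    size_a, size_b = sum(voc_a), sum(voc_b)
    inner = sum(x * y for x, y in zip(voc_a, voc_b))
    return (inner - size_a * size_b / k) / (1 - 1 / k)


def estimate_union_reach(voc_a, voc_b) -> float:
    """Estimate |A ∪ B| by inclusion-exclusion."""
    return sum(voc_a) + sum(voc_b) - estimate_overlap(voc_a, voc_b)


if __name__ == "__main__":
    k = 4096
    pub_a = [f"user-{i}" for i in range(0, 2000)]
    pub_b = [f"user-{i}" for i in range(1000, 3000)]
    rng = random.Random(0)
    noisy_a = add_laplace_noise(vector_of_counts(pub_a, k), 1.0, rng)
    noisy_b = add_laplace_noise(vector_of_counts(pub_b, k), 1.0, rng)
    print("estimated union reach:", round(estimate_union_reach(noisy_a, noisy_b)))
```

Because the noise is zero-mean and independent across publishers, the inner product stays unbiased in this toy setup, though its variance grows with the noise scale and the number of buckets.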
Preview abstract
This document describes a secure mechanism to join sets of heterogeneous IDs from multiple data providers (e.g. broadcasters, publishers, data enrichment providers) and create a set of encrypted common identifiers, which we refer to as SUMIDs. These identifiers can be used for computing multi-party reach, frequency, sales lift, MTA, and other ads metrics. We also introduce the concept of “match rules”, which dictate when two heterogeneous IDs should be assigned the same SUMID within the secure mechanism. We avoid prescribing specific match rules, as these could vary depending on a number of considerations, such as the specific media market or whether the SUMID is intended to represent a household or an individual. The optimal choice of match rules is also an open area of research.
A System Design for Privacy-Preserving Reach and Frequency Estimation
Eli Fox-Epstein
Jason Frye
Mark Alois Fashing
Raimundo Mirisola
Yao Wang
Yunus Yenigor
Google, LLC (2020)
Preview abstract
This document describes the high-level system design of an MPC-based approach to Private Reach and Frequency Estimation. The MPC protocol was previously described in Privacy-Preserving Secure Cardinality and Frequency Estimation [1], and although several modifications are forthcoming, they do not impact the overall system design.
Privacy-Preserving Secure Cardinality and Frequency Estimation
Benjamin Kreuter
Raimundo Mirisola
Yao Wang
Google, LLC (2020)
Preview abstract
In this paper we introduce a new family of methods for cardinality and frequency estimation. These methods combine aspects of HyperLogLog (HLL) and Bloom filters in order to build a sketch that, like HLL, is substantially more compact than a Bloom filter, but, like a Bloom filter, maintains the ability to union sketches with a bucket-wise sum. Together these properties enable the creation of a scalable secure multi-party computation protocol that takes advantage of homomorphic encryption to combine sketches across multiple untrusted parties. The protocol limits the amount of information that participants learn to differentially private estimates of the union of sketches and some partial information about the Venn diagram of the per-sketch cardinalities.
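To make the "union by bucket-wise sum" property concrete, here is a minimal sketch using a plain counting Bloom filter with a single uniform hash. This is a simplified stand-in, not the sketch family from the paper, and it omits encryption and differential privacy entirely. The function names and the empty-bucket cardinality estimator are assumptions made for illustration.

```python
import hashlib
import math


def bucket_of(item: str, num_buckets: int) -> int:
    """Hash an item to a bucket; all parties must share the same hash."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def build_sketch(items, num_buckets: int) -> list:
    """Counting-Bloom-filter-style sketch: one counter per bucket."""
    sketch = [0] * num_buckets
    for item in set(items):
        sketch[bucket_of(item, num_buckets)] += 1
    return sketch


def merge(sketches) -> list:
    """Union sketches by summing them bucket by bucket.

    A bucket in the merged sketch is non-zero exactly when some input
    sketch had an item there, so the non-zero pattern represents the union.
    """
    return [sum(column) for column in zip(*sketches)]


def estimate_cardinality(sketch) -> float:
    """Estimate distinct items from the fraction of empty buckets.

    With n items hashed uniformly into k buckets, the expected empty
    fraction is about exp(-n / k), which gives n ≈ -k * ln(empty / k).
    """
    k = len(sketch)
    empty = sum(1 for count in sketch if count == 0)
    if empty == 0:
        raise ValueError("sketch is saturated; use more buckets")
    return -k * math.log(empty / k)


if __name__ == "__main__":
    k = 16384
    party_a = build_sketch((f"user-{i}" for i in range(0, 3000)), k)
    party_b = build_sketch((f"user-{i}" for i in range(2000, 5000)), k)
    combined = merge([party_a, party_b])
    print("estimated union cardinality:", round(estimate_cardinality(combined)))
```

Because merging only needs addition, a sum of this kind is the sort of operation an additively homomorphic encryption scheme can carry out on encrypted counters, which is what makes the property useful in a multi-party setting.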