Differentially Private SQL with Bounded User Contribution

Celia Yuxin Zhang
Damien Desfontaines
Daniel Simmons-Marengo
Proceedings on Privacy Enhancing Technologies Symposium (2020) (to appear)

Abstract

Differential privacy (DP) provides a theoretical promise to users and analysts limiting the ability to determine a user’s contribution (if any) to the results of analysis. While there have been many theoretical explorations into the design of DP algorithms, few generically practical implementations of end-to-end DP engines exist. This paper presents a practical SQL-based engine that provides privacy guarantees with respect to groups of records, possibly spanning multiple tables, owned by a single entity. To date there has been little work to provide this type of protection for multiple rows in the same table or joins more generally. The engine utilizes a novel algorithm that evaluates query aggregations using a two-step process to enforce DP per owning entity. We limit the query sensitivity impact of joins by restricting and propagating a row-owner identifier at all steps, which allows us to limit row-owner contribution. For testing, we present a semi-decidable stochastic model-checking system, used to ensure privacy for the engine’s full range of statistical functions. This model provides stronger guarantees on privacy than existing systems with comparable accuracy. The result is a general purpose SQL engine, capable of answering typical analysis questions with little or no modification to existing queries.