Is API Access to LLMs Useful for Generating Private Synthetic Tabular Data?
Abstract
Differentially private (DP) synthetic data is a versatile tool for enabling the analysis of private data. With the rise of foundation models, a number of new synthetic data algorithms privately finetune the weights of foundation models to improve over existing approaches to generating private synthetic data. In this work, we propose two algorithms for using API access only to generate DP tabular synthetic data. We extend the Private Evolution algorithm \citep{lin2023differentially, xie2024differentially} to the tabular data domain, define a workload-based distance measure, and propose a family of algorithms that use one-shot API access to LLMs.