Christopher A. Choquette-Choo
I am a Research Scientist in Google Deepmind on the Privacy and Security team.
I work on privacy-preserving machine learning, memorization of large language models, and various other topics at the intersection of machine learning with privacy and security.
For more detail, see my personal website: https://www.christopherchoquette.com
Authored Publications
Sort By
Federated Learning of Gboard Language Models with Differential Privacy
Yanxiang Zhang
Galen Andrew
Jesse Rosenstock
Yuanbo Zhang
ACL industry track (2023) (to appear)
Preview abstract
We train language models (LMs) with federated learning (FL) and differential privacy (DP) in the Google Keyboard (Gboard). We apply the DP-Follow-the-Regularized-Leader (DP-FTRL)~\citep{kairouz21b} algorithm to achieve meaningfully formal DP guarantees without requiring uniform sampling of client devices.
To provide favorable privacy-utility trade-offs, we introduce a new client participation criterion and discuss the implication of its configuration in large scale systems. We show how quantile-based clip estimation~\citep{andrew2019differentially} can be combined with DP-FTRL to adaptively choose the clip norm during training or reduce the hyperparameter tuning in preparation for training.
With the help of pretraining on public data, we train and deploy more than twenty Gboard LMs that achieve high utility and $\rho-$zCDP privacy guarantees with $\rho \in (0.2, 2)$, with two models additionally trained with secure aggregation~\citep{bonawitz2017practical}.
We are happy to announce that all the next word prediction neural network LMs in Gboard now have DP guarantees, and all future launches of Gboard neural network LMs will require DP guarantees.
We summarize our experience and provide concrete suggestions on DP training for practitioners.
View details