What does it mean for a language model to preserve privacy?

Fatemehsadat Mireshghallah
Florian Tramèr
Hannah Brown
Reza Shokri
FAccT (2022)

Abstract

Our language reflects who we are. The words and phrases we use, as well as the contextual information in our conversations, disclose details of our personal lives. As humans, we learn to communicate about ourselves and others while delicately concealing private information depending on the context of a conversation. Language models, however, lack the ability to understand context and analyze the sensitivity of text, and they tend to memorize phrases and retain information about their training sets. As a result, inference attacks have been shown to be alarmingly successful at extracting sensitive data from language models. In this paper, we discuss the privacy expectations placed on language models and provide a critical analysis of the major data protection techniques: data redaction (scrubbing) and differential privacy. We show that these protection methods can guarantee, at best, a very limited form of privacy, one that does not account for correlations and other nuances of human communication. We conclude by arguing that language models should be trained on data that was intended for public use, with proper consent and authorization from its authors.