Irippuge Milinda Perera
Irippuge Milinda Perera is a Software Engineer in the Security & Privacy group at Google. He received his Ph.D. in Computer Science from the City University of New York (CUNY) in 2015. His research interests are in data anonymization, mobile security, steganography, and cryptography.
Authored Publications
Google COVID-19 Vaccination Search Insights: Anonymization Process Description
Adam Boulanger
Akim Kumok
Arti Patankar
Benjamin Miller
Chaitanya Kamath
Charlotte Stanton
Chris Scott
Damien Desfontaines
Evgeniy Gabrilovich
Gregory A. Wellenius
John S. Davis
Karen Lee Smith
Krishna Kumar Gadepalli
Mark Young
Shailesh Bavadekar
Tague Griffith
Yael Mayer
Arxiv.org (2021)
This report describes the aggregation and anonymization process applied to the COVID-19 Vaccination Search Insights, a publicly available dataset showing aggregated and anonymized trends in Google searches related to COVID-19 vaccination. The applied anonymization techniques protect every user’s daily search activity related to COVID-19 vaccinations with $(\varepsilon, \delta)$-differential privacy for $\varepsilon = 2.19$ and $\delta = 10^{-5}$.
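For reference, the standard definition of $(\varepsilon, \delta)$-differential privacy, which the abstract invokes but does not restate, can be written as follows. The symbols $M$ (the anonymization mechanism), $D$, $D'$ (neighboring inputs), and $S$ (a set of outputs) are notation introduced here. Treating neighboring inputs as differing in one user's daily vaccination-related search activity is an assumption drawn from the abstract's wording.

```latex
\[
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
\quad \text{for all neighboring } D, D' \text{ and all output sets } S.
\]
```

With $\varepsilon = 2.19$ the multiplicative factor is $e^{2.19} \approx 8.9$, and $\delta = 10^{-5}$ bounds the additive slack.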
A General Purpose Transpiler for Fully Homomorphic Encryption
Shruthi Gorantala
Rob Springer
Sean Purser-Haskell
Asra Ali
Eric P. Astor
Itai Zukerman
Sam Ruth
Phillipp Schoppmann
Sasha Kulankhina
Alain Forget
David Marn
Cameron Tew
Rafael Misoczki
Bernat Guillen
Xinyu Ye
Damien Desfontaines
Aishe Krishnamurthy
Miguel Guevara
Yurii Sushko
Google LLC (2021)
Fully homomorphic encryption (FHE) is an encryption scheme which enables computation on encrypted data without revealing the underlying data. While there have been many advances in the field of FHE, developing programs using FHE still requires expertise in cryptography. In this white paper, we present a fully homomorphic encryption transpiler that allows developers to convert high-level code (e.g., C++) that works on unencrypted data into high-level code that operates on encrypted data. Thus, our transpiler makes transformations possible on encrypted data.
Our transpiler builds on Google's open-source XLS SDK (https://github.com/google/xls) and uses an off-the-shelf FHE library, TFHE (https://tfhe.github.io/tfhe/), to perform low-level FHE operations. The transpiler design is modular, which means the underlying FHE library as well as the high-level input and output languages can vary. This modularity will help accelerate FHE research by providing an easy way to compare arbitrary programs in different FHE schemes side-by-side. We hope this lays the groundwork for eventual easy adoption of FHE by software developers. As a proof-of-concept, we are releasing an experimental transpiler (https://github.com/google/fully-homomorphic-encryption/tree/main/transpiler) as open-source software.
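To make the idea of a pluggable backend concrete, here is a minimal sketch in Python. It is not the transpiler's code or API. It assumes, for illustration only, that a program has already been lowered to Boolean gates, and it uses a hypothetical `PlainGates` backend that works on unencrypted bits. In a real setting a gate-level FHE library would supply the same three operations over ciphertexts.

```python
class PlainGates:
    """Hypothetical stand-in backend: plaintext bits, no encryption at all."""

    @staticmethod
    def xor(a, b):
        return a ^ b

    @staticmethod
    def and_(a, b):
        return a & b

    @staticmethod
    def or_(a, b):
        return a | b


def add_bits(gates, xs, ys):
    """Ripple-carry adder over little-endian bit lists of equal, non-zero length.

    The circuit only calls gates.xor / gates.and_ / gates.or_, so swapping the
    backend changes how bits are represented without changing the circuit.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("inputs must be non-empty and of equal width")
    out = []
    carry = None
    for a, b in zip(xs, ys):
        partial = gates.xor(a, b)
        if carry is None:
            # First bit position: a half adder, no incoming carry.
            out.append(partial)
            carry = gates.and_(a, b)
        else:
            # Later positions: a full adder built from two half adders.
            out.append(gates.xor(partial, carry))
            carry = gates.or_(gates.and_(a, b), gates.and_(partial, carry))
    out.append(carry)  # final carry becomes the top output bit
    return out


def to_bits(n, width):
    """Little-endian bit list of n, truncated to `width` bits."""
    return [(n >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits for plaintext bits."""
    return sum(bit << i for i, bit in enumerate(bits))


total = from_bits(add_bits(PlainGates, to_bits(13, 8), to_bits(200, 8)))
```

The branch on `carry is None` depends only on the bit position, never on the data, which is the property a circuit evaluated over encrypted bits needs.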
Google COVID-19 Search Trends Symptoms Dataset: Anonymization Process Description
Akim Kumok
Chaitanya Kamath
Charlotte Stanton
Damien Desfontaines
Evgeniy Gabrilovich
Gerardo Flores
Gregory Alexander Wellenius
Ilya Eckstein
John S. Davis
Katie Everett
Krishna Kumar Gadepalli
Rayman Huang
Shailesh Bavadekar
Thomas Ludwig Roessler
Venky Ramachandran
Yael Mayer
Arxiv.org (2020)
This report describes the aggregation and anonymization process applied to the initial version of the COVID-19 Search Trends symptoms dataset, a publicly available dataset that shows aggregated, anonymized trends in Google searches for symptoms (and some related topics). The anonymization process is designed to protect the daily search activity of every user with $\varepsilon$-differential privacy for $\varepsilon = 1.68$.
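The abstract does not say which noise mechanism the report uses. As a minimal sketch of how an $\varepsilon$-differentially private count can be produced in principle, the textbook Laplace mechanism adds noise with scale sensitivity/$\varepsilon$ to the true count. The function names, the sensitivity of 1, and the example count below are assumptions for illustration, not details taken from the report.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Textbook Laplace mechanism for a numeric query.

    Assumes one user changes the count by at most `sensitivity`.
    Illustration only: naive floating-point sampling like this is known to be
    unsafe for production differential privacy.
    """
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    return true_count + laplace_noise(scale, rng)


# Hypothetical example: a true count of 100 released with epsilon = 1.68.
released = noisy_count(100, epsilon=1.68)
```

A smaller $\varepsilon$ gives a larger noise scale and therefore stronger privacy at the cost of accuracy.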
KHyperLogLog: Estimating Reidentifiability and Joinability of Large Data at Scale
Pern Hui Chia
Damien Desfontaines
Daniel Simmons-Marengo
Chao Li
Wei-Yen Day
Qiushi Wang
Miguel Guevara
Proceedings of the 2019 IEEE Symposium on Security and Privacy
Understanding the privacy-relevant characteristics of data sets, such as reidentifiability and joinability, is crucial for data governance, yet can be difficult for large data sets. While computing the data characteristics by brute force is straightforward, the scale of systems and data collected by large organizations demands an efficient approach. We present KHyperLogLog (KHLL), an algorithm based on approximate counting techniques that can estimate the reidentifiability and joinability risks of very large databases using linear runtime and minimal memory. KHLL enables one to measure reidentifiability of data quantitatively, rather than based on expert judgement or manual reviews. Meanwhile, joinability analysis using KHLL helps ensure the separation of pseudonymous and identified data sets. We describe how organizations can use KHLL to improve protection of user privacy. The efficiency of KHLL allows one to schedule periodic analyses that detect any deviations from the expected risks over time as a regression test for privacy. We validate the performance and accuracy of KHLL through experiments using proprietary and publicly available data sets.
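The abstract does not spell out the data structure, so the following is only a simplified sketch of the general idea, not the paper's algorithm. It assumes a two-level design: a K-Minimum-Values sample of field values, with a record of the users seen for each sampled value. For brevity it stores exact user-ID sets where an approximate counter such as HyperLogLog would keep memory small. All names (`build_sketch`, `estimate_unique_fraction`) are hypothetical.

```python
import hashlib


def _hash64(value):
    """Deterministic 64-bit hash of a value's string form."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_sketch(rows, k):
    """One pass over (value, user_id) rows, keeping the k smallest value hashes.

    Each kept hash maps to the set of user IDs seen with that value. A value
    that is evicted or rejected can never re-enter, because the largest kept
    hash only decreases once the sketch is full, so every kept set is complete.
    """
    sketch = {}
    for value, user_id in rows:
        h = _hash64(value)
        if h in sketch:
            sketch[h].add(user_id)
        elif len(sketch) < k:
            sketch[h] = {user_id}
        else:
            largest = max(sketch)
            if h < largest:
                del sketch[largest]
                sketch[h] = {user_id}
    return sketch


def estimate_unique_fraction(sketch):
    """Fraction of sampled values associated with exactly one user."""
    if not sketch:
        return 0.0
    unique = sum(1 for users in sketch.values() if len(users) == 1)
    return unique / len(sketch)


# Hypothetical data: 100 values, every odd-numbered one shared by two users.
rows = []
for i in range(100):
    rows.append((f"v{i}", f"u{i}"))
    if i % 2:
        rows.append((f"v{i}", f"w{i}"))

sample = build_sketch(rows, k=10)
fraction = estimate_unique_fraction(sample)
```

A high fraction of values tied to a single user suggests the field is highly reidentifying. Because the hash-based sample is roughly uniform over distinct values, the fraction measured on the sample approximates the fraction over the whole column.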