
Yurii Sushko
Yurii Sushko leads Google's anonymization team. His team builds anonymization and differential privacy infrastructure and enables Google products to anonymize data and ML models.
Yurii holds a PhD in bioinformatics from the Technical University of Munich, where he worked on building open solutions for AI-assisted drug design.
Authored Publications
Our Approach to Protecting AI Training Data
Cindy Muya
Jason Novak
Cindee Madison
Reiner Critides
Ben Kamber
Niha Vempati
Jeremy Wiesner
Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043 (2025)
Abstract
Google has over 25 years of experience protecting data from inappropriate access and unauthorized use. In the era of AI, Google has extended these best practices in data protection to ensure that the right data is used the right way to train models. This paper presents a number of these best practices, describes how Google applies them in its systems, and describes how Google Cloud customers can use Google Cloud capabilities to implement these practices themselves.
Protecting data requires both technical controls to enable safe data use at scale and governance processes to give companies visibility and control over how their data is used. This fundamentally requires: understanding data and ensuring it carries sufficient metadata in the form of attributes; controlling data by implementing policies that allow (or disallow) certain usage based on those attributes; transforming data to enable its usage in policy-compliant ways; and maintaining human oversight and governance.
Protecting data in AI inherits these requirements and introduces new ones to account for AI-specific risks, including memorization and recitation as well as the cost of training foundation models. Meeting these risks requires new capabilities, including enhanced understanding of data and model lineage and an increased ability to control data usage through policy-compliance checks performed at the time a training job is configured, before it is run.
This white paper offers an in-depth look at data protection best practices and Google's data protection capabilities, and is one of a series of publications about Google's Secure AI Framework (SAIF). Building upon its secure development practices, Google has developed and deployed a number of capabilities to understand, control, and transform data in its infrastructure so that data is both protected and used appropriately. This involves robust annotation systems that represent metadata and enable granular understanding of data at both the item and dataset level, policy engines that evaluate machine-readable policies against those metadata attributes, and sensors that track how data flows across Google's systems and raise alerts when policy violations occur. Moreover, Google has developed de-identification and anonymization systems that transform data to make it policy compliant and safer to use for AI training.
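To make the attribute-and-policy model concrete, the following is a minimal, hypothetical sketch of a configuration-time check. It is not Google's actual infrastructure; every type, policy, and attribute name here is invented for illustration. A policy engine evaluates machine-readable policies against a dataset's metadata attributes and can block a training job before it runs:

#include <iostream>
#include <set>
#include <string>
#include <vector>

// Hypothetical: a dataset carries metadata attributes, and a
// machine-readable policy allows or denies a proposed use based on them.
struct Dataset {
  std::string name;
  std::set<std::string> attributes;  // e.g. "user_data", "anonymized"
};

struct Policy {
  // An attribute that forbids training use unless a mitigating
  // attribute (e.g. proof of anonymization) is also present.
  std::string restricted_attribute;
  std::string mitigating_attribute;
};

// Evaluated when a training job is configured, before it runs.
bool AllowedForTraining(const Dataset& d, const std::vector<Policy>& policies) {
  for (const auto& p : policies) {
    if (d.attributes.count(p.restricted_attribute) &&
        !d.attributes.count(p.mitigating_attribute)) {
      return false;  // policy violation: block the job at configuration time
    }
  }
  return true;
}

int main() {
  std::vector<Policy> policies = {{"user_data", "anonymized"}};
  Dataset raw{"raw_logs", {"user_data"}};
  Dataset safe{"anon_logs", {"user_data", "anonymized"}};
  std::cout << raw.name << ": " << AllowedForTraining(raw, policies) << "\n";   // 0
  std::cout << safe.name << ": " << AllowedForTraining(safe, policies) << "\n"; // 1
}

Evaluating policies when the job is configured, rather than after training, means a violation can stop the run before any model has a chance to memorize the data.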
A General Purpose Transpiler for Fully Homomorphic Encryption
Shruthi Gorantala
Sean Purser-Haskell
Asra Ali
Eric P. Astor
Itai Zukerman
Sam Ruth
Phillipp Schoppmann
Sasha Kulankhina
Alain Forget
David Marn
Cameron Tew
Rafael Misoczki
Bernat Guillen
Xinyu Ye
Damien Desfontaines
Aishe Krishnamurthy
Miguel Guevara
Google LLC (2021)
Abstract
Fully homomorphic encryption (FHE) is an encryption scheme that enables computation on encrypted data without revealing the underlying data. While there have been many advances in the field of FHE, developing programs using FHE still requires expertise in cryptography. In this white paper, we present a fully homomorphic encryption transpiler that converts high-level code (e.g., C++) operating on unencrypted data into high-level code that operates on encrypted data, making such computations accessible to developers without cryptographic expertise.
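As an illustration of what "high-level code" means here, the input to the transpiler is ordinary C++. The sketch below is hypothetical: the entry-point pragma follows the convention used by the XLS C++ frontend in the open-source release, but the exact annotation and the generated encrypted interface may differ between versions.

#include <iostream>

// Hypothetical transpiler input: ordinary C++ over cleartext ints.
// The pragma marks the entry point for the (XLS-based) frontend; the
// exact annotation may vary by release.
#pragma hls_top
int discount(int amount) {
  // Under FHE, both branches become part of a single data-independent
  // boolean circuit, so the evaluator never learns which branch applied.
  return (amount > 100) ? amount - 10 : amount;
}

int main() {
  // The same function still runs normally on cleartext for testing.
  std::cout << discount(120) << "\n";  // prints 110
}

From input like this, the transpiler emits an equivalent function whose parameters and result are TFHE ciphertexts, so the computation can run without the inputs ever being decrypted.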
Our transpiler builds on Google's open-source XLS SDK (https://github.com/google/xls) and uses an off-the-shelf FHE library, TFHE (https://tfhe.github.io/tfhe/), to perform low-level FHE operations. The transpiler design is modular: both the underlying FHE library and the high-level input and output languages can vary. This modularity will help accelerate FHE research by providing an easy way to compare arbitrary programs across different FHE schemes side by side. We hope this lays the groundwork for the eventual adoption of FHE by software developers. As a proof of concept, we are releasing an experimental transpiler (https://github.com/google/fully-homomorphic-encryption/tree/main/transpiler) as open-source software.
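One way to picture the modularity is that transpiled programs reduce to gate-level operations behind a narrow interface, so a different FHE library can be slotted in underneath. The sketch below is invented for illustration and is not the transpiler's actual API; a toy, insecure "backend" stands in for a real scheme:

#include <iostream>

// Hypothetical gate-level backend interface, invented for illustration:
// the transpiled circuit calls these operations, so swapping the FHE
// library means swapping the implementation, not the circuit code.
class FheBackend {
 public:
  virtual ~FheBackend() = default;
  virtual int Encrypt(bool bit) = 0;
  virtual bool Decrypt(int ct) = 0;
  virtual int And(int a, int b) = 0;
  virtual int Xor(int a, int b) = 0;
};

// Toy stand-in backend (no real security!) used only to show that the
// circuit below runs unchanged against any backend implementation.
class ToyBackend : public FheBackend {
 public:
  int Encrypt(bool bit) override { return bit ? 1 : 0; }
  bool Decrypt(int ct) override { return ct != 0; }
  int And(int a, int b) override { return a & b; }
  int Xor(int a, int b) override { return a ^ b; }
};

// A 1-bit half adder expressed as gate calls, the shape of transpiled code.
void HalfAdder(FheBackend& fhe, int a, int b, int* sum, int* carry) {
  *sum = fhe.Xor(a, b);
  *carry = fhe.And(a, b);
}

int main() {
  ToyBackend fhe;
  int sum, carry;
  HalfAdder(fhe, fhe.Encrypt(true), fhe.Encrypt(true), &sum, &carry);
  std::cout << "sum=" << fhe.Decrypt(sum)
            << " carry=" << fhe.Decrypt(carry) << "\n";  // sum=0 carry=1
}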