A Method to Reveal Speaker Identity in Distributed ASR Training, and How to Counter It

Trung Dang; Om Thakkar; Swaroop Ramaswamy; Rajiv Mathews; Peter Chin; Françoise Simone Beaufays

IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022, Virtual and Singapore, 23-27 May 2022, IEEE, pp. 4338-4342

Abstract

End-to-end Automatic Speech Recognition (ASR) models are commonly trained over spoken utterances using optimization methods like Stochastic Gradient Descent (SGD). In distributed settings like Federated Learning, model training requires transmission of gradients over a network. In this work, we design the first method for revealing the identity of the speaker of a training utterance with access only to a gradient. We propose Hessian-Free Gradients Matching, an input reconstruction technique that operates without second derivatives of the loss function, which prior works require and which can be expensive to compute. We show the effectiveness of our method using the DeepSpeech model architecture, demonstrating that it is possible to reveal the speaker’s identity with 34% top-1 accuracy (51% top-5 accuracy) on the LibriSpeech dataset. Further, we study the effect of Dropout on the success of our method. We show that a dropout rate of 0.2 can reduce the speaker identity accuracy to 0% top-1 (0.5% top-5).
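
To make the idea of gradient matching concrete, the following is a minimal sketch of the generic, second-derivative-based form of the attack that the abstract attributes to prior works. It is not the paper's Hessian-Free variant, whose details are not given on this page. The toy linear regression model, the assumption that the attacker knows the weights `w` and the target `y`, and every function name and value are illustrative assumptions. The attacker observes one gradient and searches for an input whose gradient matches it.

```python
# Toy gradient matching on a one-example linear regression model.
# All names and values are illustrative assumptions, not taken from the paper.

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def model_gradient(w, x, y):
    """Gradient of the loss 0.5 * (w.x - y)^2 with respect to the weights w."""
    residual = dot(w, x) - y
    return [residual * xi for xi in x]

def matching_loss(w, x_guess, y, observed):
    """Squared distance between the guess's gradient and the observed gradient."""
    g = model_gradient(w, x_guess, y)
    return sum((gi - oi) ** 2 for gi, oi in zip(g, observed))

def matching_loss_grad(w, x_guess, y, observed):
    """Analytic gradient of matching_loss with respect to x_guess.

    This uses mixed second derivatives of the training loss:
    d g_i / d x_j = w_j * x_i + residual * [i == j].
    """
    residual = dot(w, x_guess) - y
    diff = [residual * xi - oi for xi, oi in zip(x_guess, observed)]
    shared = dot(diff, x_guess)
    return [2.0 * (wj * shared + residual * dj) for wj, dj in zip(w, diff)]

def reconstruct(w, y, observed, x_init, steps=200):
    """Gradient descent with backtracking, so the matching loss never increases."""
    x = list(x_init)
    loss = matching_loss(w, x, y, observed)
    for _ in range(steps):
        grad = matching_loss_grad(w, x, y, observed)
        step = 1.0
        while step > 1e-12:
            candidate = [xi - step * gi for xi, gi in zip(x, grad)]
            candidate_loss = matching_loss(w, candidate, y, observed)
            if candidate_loss < loss:
                x, loss = candidate, candidate_loss
                break
            step *= 0.5
        else:
            break  # no improving step found; stop early
    return x, loss

w = [0.5, -0.3, 0.8]          # model weights, assumed known to the attacker
x_true = [1.0, 2.0, -1.0]     # private training input
y = 0.5                       # target, assumed known to the attacker
observed = model_gradient(w, x_true, y)  # what is sent over the network

x_init = [0.1, 0.1, 0.1]
initial_loss = matching_loss(w, x_init, y, observed)
x_rec, final_loss = reconstruct(w, y, observed, x_init)
print("matching loss before:", initial_loss, "after:", final_loss)
```

In this toy setting the matching objective can have more than one minimizer, so a low matching loss does not guarantee that `x_rec` equals `x_true`. For a deep ASR model the analogue of `matching_loss_grad` would require differentiating through the model's gradient, which is the second-derivative cost the abstract describes as expensive.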

Research Areas

Anti abuse
