Adversarial Reprogramming of Neural Networks

Gamaleldin Fathy Elsayed; Jascha Sohl-Dickstein; Ian Goodfellow

ICLR (2019)

Abstract

Deep neural networks are susceptible to adversarial attacks. In computer vision,
well-crafted perturbations to images can cause neural networks to make mistakes
such as confusing a cat with a computer. Previous adversarial attacks have been
designed to degrade performance of models or cause machine learning models
to produce specific outputs chosen ahead of time by the attacker. We introduce
attacks that instead reprogram the target model to perform a task chosen by the
attacker, without the attacker needing to specify or compute the desired output
for each test-time input. This attack finds a single adversarial perturbation that
can be added to all test-time inputs to a machine learning model in order to cause
the model to perform a task chosen by the adversary, even if the model was not
trained to do this task. These perturbations can thus be considered a program
for the new task. We demonstrate adversarial reprogramming on six ImageNet
classification models, repurposing these models to perform a counting task, as well
as classification tasks: classification of MNIST and CIFAR-10 examples presented
as inputs to the ImageNet model.
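
The abstract describes the mechanism only at a high level: one shared perturbation is added to every input, and small images such as MNIST digits are presented as inputs to an ImageNet model. The following is a minimal sketch of how such an input could be assembled, not the authors' implementation. Several details are assumptions made for illustration and are not stated in the abstract: the small image is placed at the center of a zero canvas, the perturbation is bounded with tanh and masked so it leaves the embedded image untouched, pixel values are taken to lie in [-1, 1], and the first k output classes of the target model are reused as the k labels of the new task. All function names are hypothetical. Optimizing the weights against a real classifier is omitted.

```python
import numpy as np

def embed(small_image, canvas_size):
    """Place a small (h, w, c) image at the center of a zero canvas."""
    h, w, c = small_image.shape
    canvas = np.zeros((canvas_size, canvas_size, c), dtype=np.float64)
    top = (canvas_size - h) // 2
    left = (canvas_size - w) // 2
    canvas[top:top + h, left:left + w, :] = small_image
    return canvas

def border_mask(canvas_size, h, w, c):
    """Mask that is 0 over the embedded image and 1 elsewhere (assumed design)."""
    mask = np.ones((canvas_size, canvas_size, c), dtype=np.float64)
    top = (canvas_size - h) // 2
    left = (canvas_size - w) // 2
    mask[top:top + h, left:left + w, :] = 0.0
    return mask

def make_program(weights, mask):
    """Shared perturbation; tanh keeps it bounded (assumed parameterization)."""
    return np.tanh(weights * mask)

def reprogram_input(small_image, weights):
    """Build the input shown to the target model: embedded image + program."""
    canvas_size = weights.shape[0]
    h, w, c = small_image.shape
    canvas = embed(small_image, canvas_size)
    mask = border_mask(canvas_size, h, w, c)
    return canvas + make_program(weights, mask)

def map_labels(logits, num_new_classes):
    """Hypothetical label mapping: read the new task's label from the
    first num_new_classes outputs of the target model."""
    return int(np.argmax(logits[:num_new_classes]))
```

In this sketch the same `weights` array is reused for every input, which is what makes the perturbation act as a single "program". In a real attack the weights would be trained by gradient descent on the new task's loss while the target model's parameters stay fixed.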
