Google Research

Span Recovery for Deep Neural Networks with Applications to Input Obfuscation

ICLR 2020


The tremendous success of deep neural networks has motivated the need to better understand their fundamental properties, but many of the theoretical results proposed so far hold only for shallow networks. In this paper, we study an important primitive for understanding the meaningful input space of a deep network: span recovery. For k < n, let A ∈ R^{k×n} be the innermost weight matrix of an arbitrary feedforward neural network M : R^n → R, so M(x) can be written as M(x) = σ(Ax) for some network σ : R^k → R. The goal is then to recover the row span of A given only oracle access to the value of M(x). We show that if M is a multi-layered network with ReLU activation functions, then partial recovery is possible: namely, we can provably recover k/2 linearly independent vectors in the row span of A using poly(n) non-adaptive queries to M(x). Furthermore, if M has differentiable activation functions, we demonstrate that full span recovery is possible even when the output is first passed through a sign or 0/1 thresholding function; in this case our algorithm is adaptive. Empirically, we confirm that full span recovery can fail, but only for unrealistically thin layers. For reasonably wide networks, we obtain full span recovery on both random networks and networks trained on MNIST data. Finally, we demonstrate the utility of span recovery as an attack by inducing neural networks to misclassify data obfuscated by controlled random noise as sensical inputs.
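The differentiable case rests on a simple observation: by the chain rule, every gradient ∇M(x) is a linear combination of the rows of A, so gradients sampled at enough random points span row-span(A). The sketch below illustrates this on a toy network (the specific architecture M(x) = w·tanh(Ax), the dimensions, and all variable names are illustrative assumptions, not the paper's construction):

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 5, 20
A = rng.standard_normal((k, n))   # hypothetical innermost weight matrix, k < n
w = rng.standard_normal(k)        # hypothetical outer-layer weights

def M(x):
    # Toy differentiable network of the form M(x) = sigma(A x).
    return w @ np.tanh(A @ x)

def grad_M(x):
    # Analytic gradient: dM/dx = (w * tanh'(Ax))^T A, a vector in row-span(A).
    return (w * (1.0 - np.tanh(A @ x) ** 2)) @ A

# Sample gradients at random inputs; generically, their span is row-span(A).
G = np.stack([grad_M(rng.standard_normal(n)) for _ in range(2 * k)])
rank = np.linalg.matrix_rank(G)

# Check the recovered vectors lie in row-span(A) by projecting onto it.
P = A.T @ np.linalg.pinv(A @ A.T) @ A   # orthogonal projector onto row-span(A)
residual = np.linalg.norm(G - G @ P)
print(rank, residual)
```

With generic weights, the rank of the gradient matrix equals k and the projection residual is numerically zero, i.e. full span recovery. The paper's harder settings (ReLU networks queried non-adaptively, or outputs passed through a sign/threshold function, where gradients are unavailable) require the more careful algorithms described in the abstract.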
