Google Research

Audio Deepdream: Optimizing raw audio with convolutional networks

International Society for Music Information Retrieval Conference, Google Brain (2016)


The hallucinatory images of DeepDream opened the floodgates to a recent wave of artwork generated by neural networks. In this work, we take first steps toward applying the same idea to audio. We believe a key to solving this problem is training a deep neural network to perform a music perception task on raw audio. Consequently, we have followed in the footsteps of van den Oord et al. and trained a network to predict embeddings that were themselves the result of a collaborative filtering model. A key difference is that we learn features directly from the raw audio, which creates a chain of differentiable functions from raw audio to high-level features. We then use gradient descent on the network, with respect to the input audio, to extract samples of "dreamed" audio.
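
As a rough illustration of optimizing a waveform through a differentiable chain, the following is a minimal sketch and not the network or objective used in this work. It assumes a toy "network" made of a single hand-picked 1-D filter followed by a ReLU, and it treats the mean activation as the feature to excite; the filter, step size, and the names `conv1d`, `objective`, `grad_wrt_audio`, and `dream` are all hypothetical. Gradient descent is applied to the negated activation, which is the same as ascending the activation.

```python
import math
import random

def conv1d(signal, kernel):
    """Valid 1-D cross-correlation of a waveform with one filter."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def objective(signal, kernel):
    """Mean ReLU activation of the toy feature we want to excite."""
    pre = conv1d(signal, kernel)
    return sum(max(0.0, p) for p in pre) / len(pre)

def grad_wrt_audio(signal, kernel):
    """Gradient of the objective with respect to every audio sample."""
    pre = conv1d(signal, kernel)
    grad = [0.0] * len(signal)
    for i, p in enumerate(pre):
        if p > 0.0:  # ReLU passes gradient only where it is active
            for j, w in enumerate(kernel):
                grad[i + j] += w / len(pre)
    return grad

def dream(signal, kernel, steps=50, lr=0.1):
    """Descend the negated objective by updating the audio samples themselves."""
    x = list(signal)
    for _ in range(steps):
        g = grad_wrt_audio(x, kernel)
        norm = math.sqrt(sum(v * v for v in g)) or 1.0  # avoid division by zero
        # Normalized step, then clip to the usual [-1, 1] waveform range.
        x = [min(1.0, max(-1.0, xi + lr * gi / norm)) for xi, gi in zip(x, g)]
    return x

random.seed(0)
kernel = [math.sin(2 * math.pi * j / 8) for j in range(16)]  # hypothetical filter
audio = [random.uniform(-0.01, 0.01) for _ in range(256)]    # near-silent start
dreamed = dream(audio, kernel)
```

In a real setting the hand-written gradient would be replaced by automatic differentiation through the trained network, and the objective would be a chosen layer or embedding dimension rather than a single filter response.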
