- Adam Roberts
- Cinjon Resnick
- Diego Ardila
- Doug Eck
Abstract
The hallucinatory images of DeepDream opened the floodgates for a recent wave of artwork generated by neural networks. In this work, we take first steps toward applying this technique to audio. We believe a key to solving this problem is training a deep neural network to perform a music perception task on raw audio. Consequently, we have followed in the footsteps of Van den Oord et al. and trained a network to predict embeddings that were themselves the output of a collaborative filtering model. A key difference is that we learn features directly from the raw audio, which creates a chain of differentiable functions from raw audio to high-level features. We then use gradient descent through the network, with respect to the input waveform, to extract samples of "dreamed" audio.
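The idea of optimizing the input waveform through a differentiable feature extractor can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a single random `tanh` layer over non-overlapping frames stands in for the trained network, and the names `activation`, `objective`, `objective_grad`, and `dream` are hypothetical. It is not the architecture or training setup described in the abstract. The sketch performs gradient ascent on one unit's mean activation, which is equivalent to gradient descent on its negative.

```python
import numpy as np


def activation(audio, weights, unit, frame_len):
    """Activation of one hidden unit for each non-overlapping frame of audio."""
    frames = audio.reshape(-1, frame_len)          # (n_frames, frame_len)
    return np.tanh(frames @ weights[:, unit])      # (n_frames,)


def objective(audio, weights, unit, frame_len):
    """Mean activation of the chosen unit over all frames."""
    return float(activation(audio, weights, unit, frame_len).mean())


def objective_grad(audio, weights, unit, frame_len):
    """Analytic gradient of the objective with respect to the raw samples."""
    h = activation(audio, weights, unit, frame_len)
    # d/dx tanh(x . w) = (1 - tanh^2(x . w)) * w, averaged over frames
    grad_frames = (1.0 - h ** 2)[:, None] * weights[:, unit][None, :] / h.size
    return grad_frames.reshape(-1)


def dream(audio, weights, unit, frame_len, steps=200, lr=0.5):
    """Repeatedly nudge the waveform in the direction that excites the unit."""
    x = audio.copy()
    for _ in range(steps):
        x += lr * objective_grad(x, weights, unit, frame_len)
    return x


# Toy stand-in for a trained network: random weights, near-silent starting audio.
rng = np.random.default_rng(0)
frame_len, n_units, unit = 64, 8, 3
weights = rng.normal(scale=0.1, size=(frame_len, n_units))
start = rng.normal(scale=0.01, size=frame_len * 16)

dreamed = dream(start, weights, unit, frame_len)
before = objective(start, weights, unit, frame_len)
after = objective(dreamed, weights, unit, frame_len)
print(f"mean activation before: {before:.3f}, after: {after:.3f}")
```

With a real trained model, the hand-written gradient would normally be replaced by automatic differentiation through the whole network; the update loop on the input stays the same.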