Google Research

Semantically Meaningful Attributes from Cowatch Embeddings for Playlist Exploration and Expansion

International Society for Music Information Retrieval Conference (2020)


Audio embeddings for musical similarity are often used for autoplay discovery. These embeddings are typically learned by using co-listen data to train a deep neural network that provides consistent triplet-loss distances. Instead of using the co-listen–based embeddings directly, we create an embedding space by training classifiers for attributes that describe music in human terms. This attribute-embedding space allows us to provide recommendations, for use by music curators, that are less likely to be completely unintelligible. Each attribute used in this embedding space is built on top of the co-listen–based embeddings, sometimes with additional inputs for other metadata. We examine the relative performance of these two embedding spaces (the co-listen audio embedding and the attribute embedding) for the mathematical separation of thematic playlists. We also report on the usefulness of recommendations from the attribute-embedding space to human curators.
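
As a rough illustration of the idea of building an attribute-embedding space on top of existing embeddings, the sketch below trains one small classifier per attribute and stacks the classifier outputs into a new vector per track. It is not the method from the paper: the base embeddings are random stand-ins for co-listen embeddings, the three attributes and their labels are synthetic, and plain logistic regression is an assumed choice of classifier. The names `train_attribute_classifier` and `attribute_embedding` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def train_attribute_classifier(X, y, lr=0.1, steps=500):
    """Fit logistic regression for one binary attribute by gradient descent.

    X: (n_tracks, dim) base embeddings; y: (n_tracks,) labels in {0, 1}.
    Hypothetical helper; the classifier type is an assumption.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        err = sigmoid(X @ w + b) - y      # gradient of log loss w.r.t. scores
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b


def attribute_embedding(X, classifiers):
    """Stack each attribute's predicted probability into one vector per track."""
    return np.stack([sigmoid(X @ w + b) for w, b in classifiers], axis=1)


rng = np.random.default_rng(0)
base = rng.normal(size=(200, 16))         # stand-in for co-listen embeddings
directions = rng.normal(size=(3, 16))     # synthetic "attributes" (assumption)
labels = (base @ directions.T > 0).astype(float)   # shape (200, 3)

classifiers = [train_attribute_classifier(base, labels[:, k]) for k in range(3)]
attrs = attribute_embedding(base, classifiers)     # shape (200, 3)
```

Each column of `attrs` corresponds to one named attribute, so a distance or a recommendation computed in this space can be explained in terms of those attributes rather than opaque embedding dimensions.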
