Google Research

Transfer Learning In MIR: Sharing Learned Latent Representations For Music Audio Classification And Similarity

14th International Conference on Music Information Retrieval (ISMIR '13) (2013)


This paper discusses the concept of transfer learning and its potential applications to MIR tasks such as music audio classification and similarity. In a traditional supervised machine learning setting, a system can only use labeled data from a single dataset to solve a given task. The labels associated with the dataset define the nature of the task to solve. A key advantage of transfer learning is in leveraging knowledge from related tasks to improve performance on a given target task. One way to transfer knowledge is to learn a shared latent representation across related tasks. This method has shown to be beneficial in many domains of machine learning, but has yet to be explored in MIR. Many MIR datasets for audio classification present a semantic overlap in their labels. Furthermore, these datasets often contain relatively few songs. Thus, there is a strong case for exploring methods to share knowledge between these datasets towards a more general and robust understanding of high level musical concepts such as genre and similarity. Our results show that shared representations can improve classification accuracy. We also show how transfer learning can improve performance for music similarity.

Learn more about how we do research

We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work