Philippe Hamel
Authored Publications
To Have a Tiger by the Tail: Improving Music Recommendation for International Users
Machine Learning for Music Discovery Workshop, ICML 2015
Transfer Learning in MIR: Sharing Learned Latent Representations for Music Audio Classification and Similarity
Matthew E. P. Davies
Kazuyoshi Yoshii
Masataka Goto
14th International Society for Music Information Retrieval Conference (ISMIR 2013)
Abstract:
This paper discusses the concept of transfer learning and its potential applications to MIR tasks such as music audio classification and similarity. In a traditional supervised machine learning setting, a system can only use labeled data from a single dataset to solve a given task. The labels associated with the dataset define the nature of the task to solve. A key advantage of transfer learning lies in leveraging knowledge from related tasks to improve performance on a given target task. One way to transfer knowledge is to learn a shared latent representation across related tasks. This method has been shown to be beneficial in many domains of machine learning, but has yet to be explored in MIR. Many MIR datasets for audio classification present a semantic overlap in their labels. Furthermore, these datasets often contain relatively few songs. There is therefore a strong case for exploring methods that share knowledge between these datasets, working towards a more general and robust understanding of high-level musical concepts such as genre and similarity. Our results show that shared representations can improve classification accuracy. We also show how transfer learning can improve performance for music similarity.
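
The shared-representation idea can be illustrated with a short sketch. The code below is not the paper's implementation; it is a minimal PyTorch example, with assumed feature dimensions, layer sizes, and task names (genre and mood tagging), of a single encoder whose latent space is shaped jointly by several task-specific classification heads.

    # Minimal sketch (not the paper's method): one shared encoder, one head
    # per labeled MIR dataset. All sizes and task names are assumptions.
    import torch
    import torch.nn as nn

    N_FEATURES = 128   # frame-level audio feature dimension (assumed)
    N_LATENT = 64      # shared latent dimensionality (assumed)

    # Shared encoder: maps audio features into a latent space common to all tasks.
    shared = nn.Sequential(nn.Linear(N_FEATURES, N_LATENT), nn.ReLU())

    # One output head per dataset/task, e.g. genre tags vs. mood tags (assumed).
    heads = {
        "genre": nn.Linear(N_LATENT, 10),  # 10 genre classes (assumed)
        "mood": nn.Linear(N_LATENT, 5),    # 5 mood classes (assumed)
    }

    params = list(shared.parameters()) + [p for h in heads.values() for p in h.parameters()]
    opt = torch.optim.Adam(params, lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    def training_step(task, x, y):
        """One gradient step on a batch from one task; the encoder is updated
        by every task, so its latent space is shared across them."""
        opt.zero_grad()
        logits = heads[task](shared(x))
        loss = loss_fn(logits, y)
        loss.backward()
        opt.step()
        return loss.item()

    # Toy batches standing in for two labeled MIR datasets.
    for task, n_classes in [("genre", 10), ("mood", 5)]:
        x = torch.randn(32, N_FEATURES)
        y = torch.randint(0, n_classes, (32,))
        training_step(task, x, y)

    # After training, shared(x) is a representation usable for similarity
    # (e.g. cosine distance between latent vectors) or as input to new tasks.

Because every task's gradients pass through the same encoder, the latent space is pushed towards features useful across datasets, which is the transfer effect the abstract describes.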
Building Musically-relevant Audio Features through Multiple Timescale Representations
Yoshua Bengio
Proceedings of the 13th International Society for Music Information Retrieval Conference, Porto, Portugal (2012)
Abstract:
Music prediction tasks range from predicting tags for a song or audio clip, to predicting the name of the artist, to predicting related songs given a song, clip, artist name, or tag. That is, we are interested in every semantic relationship between the different musical concepts in our database. In realistically sized databases, the number of songs is measured in the hundreds of thousands or more, and the number of artists in the tens of thousands or more, posing a considerable challenge to standard machine learning techniques. In this work, we propose a method that scales to such datasets and captures the semantic similarities between database items by modeling audio, artist names, and tags in a single low-dimensional semantic embedding space. The embedding space is learned by jointly optimizing the set of prediction tasks of interest using multi-task learning. A single model trained on this joint objective is shown experimentally to be more accurate than models trained on each task alone. Our method also outperforms the baseline methods we tried and, compared with them, is faster and consumes less memory. We also demonstrate that our method learns an interpretable model, whose semantic space captures the similarities of interest well.
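
To make the joint-embedding idea concrete, here is a minimal sketch, again in PyTorch and not the paper's exact model: audio features, artist IDs, and tag IDs are all mapped into one low-dimensional space, and a margin ranking loss over two illustrative tasks (audio-to-tag and audio-to-artist) trains that space jointly. All sizes, vocabularies, and the specific loss are assumptions.

    # Minimal sketch (assumptions noted): a joint semantic embedding for
    # audio, artists, and tags, trained with a multi-task ranking loss.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    D = 32                           # embedding dimensionality (assumed)
    N_ARTISTS, N_TAGS = 1000, 200    # vocabulary sizes (assumed)
    N_AUDIO_FEATURES = 128           # audio feature dimension (assumed)

    audio_proj = nn.Linear(N_AUDIO_FEATURES, D)  # projects audio into the space
    artist_emb = nn.Embedding(N_ARTISTS, D)      # one vector per artist name
    tag_emb = nn.Embedding(N_TAGS, D)            # one vector per tag

    opt = torch.optim.Adam(
        list(audio_proj.parameters())
        + list(artist_emb.parameters())
        + list(tag_emb.parameters()),
        lr=1e-3,
    )

    def ranking_loss(anchor, positive, negative, margin=1.0):
        """Score related items above unrelated ones by at least `margin`."""
        pos = (anchor * positive).sum(-1)  # dot-product similarity
        neg = (anchor * negative).sum(-1)
        return F.relu(margin - pos + neg).mean()

    # One multi-task step: audio->tag and audio->artist share the same space.
    audio = audio_proj(torch.randn(32, N_AUDIO_FEATURES))
    good_tag = tag_emb(torch.randint(0, N_TAGS, (32,)))
    bad_tag = tag_emb(torch.randint(0, N_TAGS, (32,)))
    good_artist = artist_emb(torch.randint(0, N_ARTISTS, (32,)))
    bad_artist = artist_emb(torch.randint(0, N_ARTISTS, (32,)))

    loss = (ranking_loss(audio, good_tag, bad_tag)
            + ranking_loss(audio, good_artist, bad_artist))
    opt.zero_grad()
    loss.backward()
    opt.step()

Once trained, nearest-neighbour search in the single space serves every retrieval direction at once: tags for a song, artists for a tag, similar songs for a song, and so on, which is what makes a joint embedding attractive at this scale.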
Temporal pooling and multiscale learning for automatic annotation and ranking of music audio
Simon Lemieux
Yoshua Bengio
International Society for Music Information Retrieval Conference (ISMIR 2011)