Philippe Hamel

Authored Publications
    Transfer Learning in MIR: Sharing Learned Latent Representations for Music Audio Classification and Similarity
    Philippe Hamel
    Matthew E. P. Davies
    Kazuyoshi Yoshii
    Masataka Goto
    Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR 2013)
    Abstract: This paper discusses the concept of transfer learning and its potential applications to MIR tasks such as music audio classification and similarity. In a traditional supervised machine learning setting, a system can only use labeled data from a single dataset to solve a given task, and the labels associated with that dataset define the nature of the task. A key advantage of transfer learning is in leveraging knowledge from related tasks to improve performance on a given target task. One way to transfer knowledge is to learn a shared latent representation across related tasks. This method has been shown to be beneficial in many domains of machine learning, but has yet to be explored in MIR. Many MIR datasets for audio classification present a semantic overlap in their labels, and these datasets often contain relatively few songs. Thus, there is a strong case for exploring methods to share knowledge between these datasets towards a more general and robust understanding of high-level musical concepts such as genre and similarity. Our results show that shared representations can improve classification accuracy. We also show how transfer learning can improve performance for music similarity.
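
    As a rough illustration only (not the authors' implementation), the sketch below shows the core idea of a shared latent representation: one encoder maps audio features from several datasets into a common space, each dataset keeps its own classification head, and everything is trained jointly. All sizes, dataset names, and label counts are hypothetical placeholders, and PyTorch is assumed purely for convenience.

        import torch
        import torch.nn as nn

        n_feats, latent_dim = 128, 64  # illustrative sizes, not from the paper

        # Shared encoder: a single latent representation reused across related tasks.
        encoder = nn.Sequential(nn.Linear(n_feats, latent_dim), nn.ReLU())

        # One output head per dataset; the label counts are hypothetical.
        heads = {"dataset_a": nn.Linear(latent_dim, 10),
                 "dataset_b": nn.Linear(latent_dim, 19)}

        params = list(encoder.parameters()) + [p for h in heads.values()
                                               for p in h.parameters()]
        opt = torch.optim.Adam(params, lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()

        def train_step(batches):
            # batches: iterable of (dataset_name, feature tensor, integer labels)
            opt.zero_grad()
            loss = sum(loss_fn(heads[name](encoder(x)), y) for name, x, y in batches)
            loss.backward()
            opt.step()
            return loss.item()

        # Toy usage with random data standing in for audio features. After training,
        # distances in encoder(x) space can double as a music-similarity measure.
        batches = [("dataset_a", torch.randn(8, n_feats), torch.randint(0, 10, (8,))),
                   ("dataset_b", torch.randn(8, n_feats), torch.randint(0, 19, (8,)))]
        print(train_step(batches))
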
    Building Musically-relevant Audio Features through Multiple Timescale Representations
    Philippe Hamel
    Yoshua Bengio
    Proceedings of the 13th International Society for Music Information Retrieval Conference, Porto, Portugal (2012)
    Abstract: Music prediction tasks range from predicting tags for a song or audio clip, to predicting the name of the artist, to predicting related songs given a song, clip, artist name, or tag. That is, we are interested in every semantic relationship between the different musical concepts in our database. In realistically sized databases, the number of songs is measured in the hundreds of thousands or more, and the number of artists in the tens of thousands or more, posing a considerable challenge to standard machine learning techniques. In this work, we propose a method that scales to such datasets and attempts to capture the semantic similarities between the database items by modeling audio, artist names, and tags in a single low-dimensional semantic embedding space. This embedding space is learnt by jointly optimizing the set of prediction tasks of interest using multi-task learning. Our single model, learnt by training on the joint objective function, is shown experimentally to have improved accuracy over training on each task alone. Our method also outperforms the baseline methods tried and, in comparison to them, is faster and consumes less memory. We also demonstrate how our method learns an interpretable model, where the semantic space captures the similarities of interest well.
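
    The snippet below is a hedged sketch of the joint-embedding idea described in the abstract, not the authors' code: artists, tags, and a linear projection of audio features all live in one low-dimensional space, and both prediction tasks are trained jointly with a margin ranking loss. The loss choice, sizes, and negative-sampling scheme are assumptions for illustration.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        # Illustrative sizes only; the paper's actual dimensions are not given here.
        n_artists, n_tags, n_feats, d = 10_000, 500, 128, 100

        audio_map = nn.Linear(n_feats, d, bias=False)  # audio features -> shared space
        artist_emb = nn.Embedding(n_artists, d)        # one point per artist name
        tag_emb = nn.Embedding(n_tags, d)              # one point per tag

        opt = torch.optim.Adam(list(audio_map.parameters())
                               + list(artist_emb.parameters())
                               + list(tag_emb.parameters()), lr=1e-3)

        def margin_loss(anchor, pos, neg, margin=1.0):
            # Score items by dot product; rank the true item above a sampled negative.
            return F.relu(margin - (anchor * pos).sum(-1) + (anchor * neg).sum(-1)).mean()

        def train_step(audio, pos_tag, neg_tag, pos_artist, neg_artist):
            # Multi-task update: tag prediction and artist prediction share one space.
            opt.zero_grad()
            a = audio_map(audio)
            loss = (margin_loss(a, tag_emb(pos_tag), tag_emb(neg_tag))
                    + margin_loss(a, artist_emb(pos_artist), artist_emb(neg_artist)))
            loss.backward()
            opt.step()
            return loss.item()

        # Toy usage; at query time, nearest neighbours in the shared space answer
        # "related songs / artists / tags" queries directly.
        audio = torch.randn(8, n_feats)
        print(train_step(audio,
                         torch.randint(0, n_tags, (8,)), torch.randint(0, n_tags, (8,)),
                         torch.randint(0, n_artists, (8,)), torch.randint(0, n_artists, (8,))))
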
    Temporal pooling and multiscale learning for automatic annotation and ranking of music audio
    Philippe Hamel
    Simon Lemieux
    Yoshua Bengio
    Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011)