Thesis Open Access

Improving Generalization of Deep Learning Music Classifiers

Morgan Buisson

Citation Style Language JSON Export

  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.5554754", 
  "language": "eng", 
  "title": "Improving Generalization of Deep Learning Music Classifiers", 
  "issued": {
    "date-parts": [
  "abstract": "<p>Deep learning models have recently led to significant improvements in a wide variety of tasks. Known as being a very powerful tool capable of generalizing better than traditional machine learning approaches, deep learning models still heavily rely on large quantities of annotated data. As the field of music information retrieval is still subject to data sparsity, automatic music classification remains a challenging problem and numerous models fail at generalizing to out-of-distribution music collections. This project investigates possible directions to follow in order to improve the generalization capacity of deep learning music classifiers. More specifically, we suggest a set of guidelines to be followed in order to address the generalization problem of music classifiers trained on very small datasets. We first propose ways to maximize the amount of information extracted from small datasets through outlier detection and efficient audio data augmentation. We then show that considering the amount of perceptual ambiguity of each classification task through label smoothing can help obtain more generalizable classification bounds. We also highlight the impact label noise can have in a small dataset setting and explore ways to improve the model&rsquo;s robustness. Finally, we argue that leveraging common knowledge among related classification tasks can result in a more generalizable internal representation learned by the model. To illustrate this assumption, we employ a simple multi-task learning architecture to jointly learn pairs of tasks, and list other interesting axes to be further explored in that direction. All the suggested approaches are experimentally assessed on two state-of-the-art CNN architectures for automatic music classification. They all lead to consistent improvements over baseline models and unveil new relevant questions to rethink the task of automatic music classification.</p>", 
  "author": [
      "family": "Morgan Buisson"
  "type": "thesis", 
  "id": "5554754"
                   All versions   This version
Views              80             80
Downloads          57             57
Data volume        171.0 MB       171.0 MB
Unique views       73             73
Unique downloads   55             55


Cite as