Conference paper Open Access

Using weakly aligned score–audio pairs to train deep chroma models for cross-modal music retrieval

Frank Zalkow; Meinard Müller


Citation Style Language JSON Export

{
  "publisher": "ISMIR", 
  "DOI": "10.5281/zenodo.4245400", 
  "container-title": "Proceedings of the 21st International Society for Music Information Retrieval Conference", 
  "title": "Using weakly aligned score\u2013audio pairs to train deep chroma models for cross-modal music retrieval", 
  "issued": {
    "date-parts": [
      [
        2020, 
        10, 
        11
      ]
    ]
  }, 
  "abstract": "Many music information retrieval tasks involve the comparison of a symbolic score representation with an audio recording. A typical strategy is to compare score\u2013audio pairs based on a common mid-level representation, such as chroma features. Several recent studies demonstrated the effectiveness of deep learning models that learn task-specific mid-level representations from temporally aligned training pairs. However, in practice, there is often a lack of strongly aligned training data, in particular for real-world scenarios. In our study, we use weakly aligned score\u2013audio pairs for training, where only the beginning and end of a score excerpt are annotated in an audio recording, without aligned correspondences in between. To exploit such weakly aligned data, we employ the Connectionist Temporal Classification (CTC) loss to train a deep learning model for computing an enhanced chroma representation. We then apply this model to a cross-modal retrieval task, where we aim at finding relevant audio recordings of Western classical music, given a short monophonic musical theme in symbolic notation as a query. We present systematic experiments that show the effectiveness of the CTC-based model for this theme-based retrieval task.", 
  "author": [
    {
      "family": "Zalkow", 
      "given": "Frank"
    }, 
    {
      "family": "M\u00fcller", 
      "given": "Meinard"
    }
  ], 
  "id": "4245400", 
  "event-place": "Montreal, Canada", 
  "publisher-place": "Montreal, Canada", 
  "type": "paper-conference", 
  "event": "International Society for Music Information Retrieval Conference (ISMIR 2020)", 
  "page": "184-191"
}
                   All versions   This version
Views              128            128
Downloads          51             51
Data volume        42.4 MB        42.4 MB
Unique views       115            115
Unique downloads   45             45
