Thesis Open Access

From heuristics-based to data-driven audio melody extraction

Bosch, Juan J.


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/b9152a6d-8f1d-4a15-ba9b-2dea82df99c3/phdthesis_bosch.pdf"
      }, 
      "checksum": "md5:7bdca96e9b76245f494abdc46a51ef47", 
      "bucket": "b9152a6d-8f1d-4a15-ba9b-2dea82df99c3", 
      "key": "phdthesis_bosch.pdf", 
      "type": "pdf", 
      "size": 10313726
    }
  ], 
  "owners": [
    26613
  ], 
  "doi": "10.5281/zenodo.1120334", 
  "stats": {
    "version_unique_downloads": 111.0, 
    "unique_views": 107.0, 
    "views": 128.0, 
    "version_views": 128.0, 
    "unique_downloads": 111.0, 
    "version_unique_views": 107.0, 
    "volume": 1278902024.0, 
    "version_downloads": 124.0, 
    "downloads": 124.0, 
    "version_volume": 1278902024.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.1120334", 
    "conceptdoi": "https://doi.org/10.5281/zenodo.1120333", 
    "bucket": "https://zenodo.org/api/files/b9152a6d-8f1d-4a15-ba9b-2dea82df99c3", 
    "conceptbadge": "https://zenodo.org/badge/doi/10.5281/zenodo.1120333.svg", 
    "html": "https://zenodo.org/record/1120334", 
    "latest_html": "https://zenodo.org/record/1120334", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.1120334.svg", 
    "latest": "https://zenodo.org/api/records/1120334"
  }, 
  "conceptdoi": "10.5281/zenodo.1120333", 
  "created": "2017-12-20T17:05:58.520299+00:00", 
  "updated": "2020-03-19T11:33:00.868403+00:00", 
  "conceptrecid": "1120333", 
  "revision": 10, 
  "id": 1120334, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.1120334", 
    "description": "<p><strong>Abstract</strong></p>\n\n<p>The identification of the melody from a music recording is a relatively easy task for humans, but very challenging for computational systems. This task is known as &quot;audio melody extraction&quot;, more formally defined as the automatic estimation of the pitch sequence of the melody directly from the audio signal of a polyphonic music recording. This thesis investigates the benefits of exploiting knowledge automatically derived from data for audio melody extraction, by combining digital signal&nbsp;processing and machine learning methods. We extend the scope of melody extraction research by working with a varied dataset and multiple definitions of melody. We first present an overview of the state of the art, and perform an evaluation focused on a novel symphonic music dataset. We then propose melody extraction methods based on a source-filter model and pitch contour characterisation and evaluate them on a wide range of music genres. Finally, we explore novel timbre, tonal and spatial features for contour characterisation, and propose a method for estimating multiple melodic lines. The combination of supervised and unsupervised approaches leads to advancements on melody extraction and shows a promising path for future research and applications.</p>\n\n<p>&nbsp;</p>\n\n<p><strong>Datasets:&nbsp;</strong><br>\n<br>\nThe symphonic music dataset proposed in this thesis (Orchset) is available at:</p>\n\n<p><a href=\"https://zenodo.org/record/1289786#.XnNV15P0mL8\">https://zenodo.org/record/1289786#.XnNV15P0mL8</a></p>\n\n<p>Orchset is intended to be used as a dataset for the development and evaluation of melody extraction algorithms. This collection contains 64 audio excerpts focused on symphonic music. 
with their corresponding annotation of the melody.</p>\n\n<p><strong>Code:</strong></p>\n\n<p>The source code of the melody extraction algorithms proposed in this thesis is available at:</p>\n\n<p><a href=\"https://github.com/juanjobosch/SourceFilterContoursMelody\">https://github.com/juanjobosch/SourceFilterContoursMelody</a></p>", 
    "language": "eng", 
    "title": "From heuristics-based to data-driven audio melody extraction", 
    "license": {
      "id": "CC-BY-NC-ND-4.0"
    }, 
    "relations": {
      "version": [
        {
          "count": 1, 
          "index": 0, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "1120333"
          }, 
          "is_last": true, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "1120334"
          }
        }
      ]
    }, 
    "communities": [
      {
        "id": "mdm-dtic-upf"
      }, 
      {
        "id": "mir"
      }
    ], 
    "thesis": {
      "university": "Universitat Pompeu Fabra, Barcelona", 
      "supervisors": [
        {
          "affiliation": "Universitat Pompeu Fabra, Barcelona", 
          "name": "G\u00f3mez, Emilia"
        }
      ]
    }, 
    "keywords": [
      "Melody Extraction", 
      "Automatic", 
      "MIR", 
      "Music", 
      "Retrieval", 
      "Symphonic", 
      "Instrument", 
      "Agreement", 
      "Tonality", 
      "Timbre", 
      "Stereo", 
      "Source-filter", 
      "Separation", 
      "NMF", 
      "Visualisation", 
      "Evaluation", 
      "Dataset", 
      "Contour", 
      "Salience", 
      "Pitch", 
      "Supervised"
    ], 
    "publication_date": "2017-06-27", 
    "creators": [
      {
        "orcid": "0000-0003-4221-3517", 
        "affiliation": "Universitat Pompeu Fabra, Barcelona", 
        "name": "Bosch, Juan J."
      }
    ], 
    "access_right": "open", 
    "resource_type": {
      "subtype": "thesis", 
      "type": "publication", 
      "title": "Thesis"
    }, 
    "related_identifiers": [
      {
        "scheme": "url", 
        "identifier": "http://mtg.upf.edu/node/3737", 
        "relation": "isIdenticalTo"
      }, 
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.1120333", 
        "relation": "isVersionOf"
      }
    ]
  }
}