There is a newer version of this record available.

Video/Audio Open Access

DESED_synthetic

Turpault, Nicolas; Serizel, Romain


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/7c474126-75e3-4de3-b308-f7a07c9fbbdb/DESED_synth_dcase2019jams.tar.gz"
      }, 
      "checksum": "md5:16053563caa8e761eec043f44255d96f", 
      "bucket": "7c474126-75e3-4de3-b308-f7a07c9fbbdb", 
      "key": "DESED_synth_dcase2019jams.tar.gz", 
      "type": "gz", 
      "size": 2888312
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/7c474126-75e3-4de3-b308-f7a07c9fbbdb/DESED_synth_dcase20_train_jams.tar.gz"
      }, 
      "checksum": "md5:efe1d117cf3ff7a96341ca565f5ccf4c", 
      "bucket": "7c474126-75e3-4de3-b308-f7a07c9fbbdb", 
      "key": "DESED_synth_dcase20_train_jams.tar.gz", 
      "type": "gz", 
      "size": 1113266
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/7c474126-75e3-4de3-b308-f7a07c9fbbdb/DESED_synth_eval_dcase2019.tar.gz"
      }, 
      "checksum": "md5:e1aad0a714bb98d2b58f3d62122077b8", 
      "bucket": "7c474126-75e3-4de3-b308-f7a07c9fbbdb", 
      "key": "DESED_synth_eval_dcase2019.tar.gz", 
      "type": "gz", 
      "size": 7710291574
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/7c474126-75e3-4de3-b308-f7a07c9fbbdb/DESED_synth_soundbank.tar.gz"
      }, 
      "checksum": "md5:f061ed2322be7d338ab5f9628750ce6a", 
      "bucket": "7c474126-75e3-4de3-b308-f7a07c9fbbdb", 
      "key": "DESED_synth_soundbank.tar.gz", 
      "type": "gz", 
      "size": 2423131154
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/7c474126-75e3-4de3-b308-f7a07c9fbbdb/DESED_synth_source.tar.gz"
      }, 
      "checksum": "md5:b1d91688a56740c9ec63afc3fc769c46", 
      "bucket": "7c474126-75e3-4de3-b308-f7a07c9fbbdb", 
      "key": "DESED_synth_source.tar.gz", 
      "type": "gz", 
      "size": 27706
    }
  ], 
  "owners": [
    62138
  ], 
  "doi": "10.5281/zenodo.3713328", 
  "stats": {
    "version_unique_downloads": 4202.0, 
    "unique_views": 329.0, 
    "views": 392.0, 
    "version_views": 2377.0, 
    "unique_downloads": 1191.0, 
    "version_unique_views": 1773.0, 
    "volume": 7252763731052.0, 
    "version_downloads": 9266.0, 
    "downloads": 3039.0, 
    "version_volume": 31355182297603.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.3713328", 
    "conceptdoi": "https://doi.org/10.5281/zenodo.3550598", 
    "bucket": "https://zenodo.org/api/files/7c474126-75e3-4de3-b308-f7a07c9fbbdb", 
    "conceptbadge": "https://zenodo.org/badge/doi/10.5281/zenodo.3550598.svg", 
    "html": "https://zenodo.org/record/3713328", 
    "latest_html": "https://zenodo.org/record/4569096", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.3713328.svg", 
    "latest": "https://zenodo.org/api/records/4569096"
  }, 
  "conceptdoi": "10.5281/zenodo.3550598", 
  "created": "2020-03-17T18:15:31.133619+00:00", 
  "updated": "2021-02-28T21:37:33.759631+00:00", 
  "conceptrecid": "3550598", 
  "revision": 11, 
  "id": 3713328, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.3713328", 
    "description": "<p>Link to the associated github repository: <a href=\"https://github.com/turpaultn/Desed\">https://github.com/turpaultn/Desed</a></p>\n\n<p>Link to the papers: <a href=\"https://hal.inria.fr/hal-02160855\"><em>https://hal.inria.fr/hal-02160855</em></a>,&nbsp; <a href=\"https://hal.inria.fr/hal-02355573v1\">https://hal.inria.fr/hal-02355573v1</a></p>\n\n<p>Domestic Environment Sound Event Detection (DESED).</p>\n\n<p><strong>Description</strong><br>\nThis dataset is the synthetic part of the DESED dataset. It allows creating mixtures of isolated sounds and backgrounds.</p>\n\n<p>There is the material to:</p>\n\n<ul>\n\t<li>Reproduce the DCASE 2019 task 4 synthetic dataset</li>\n\t<li>Reproduce the DCASE 2020 task 4 synthetic train dataset</li>\n\t<li>Creating new mixtures from isolated foreground sounds and background sounds.</li>\n</ul>\n\n<p><strong>Files:</strong></p>\n\n<p><strong>If you want to generate new audio mixtures yourself from the original files.</strong></p>\n\n<ol>\n\t<li><strong>DESED_synth_soundbank.tar.gz</strong> : Raw data used to generate mixtures.</li>\n\t<li><strong>DESED_synth_dcase2019jams.tar.gz</strong>: JAMS files, metadata describing how to recreate the&nbsp; dcase2019 synthetic dataset<strong> </strong></li>\n\t<li><strong>DESED_synth_dcase20_train_jams.tar: </strong>JAMS files, metadata describing how to recreate the dcase2020 synthetic train dataset</li>\n\t<li><strong>DESED_synth_source.tar.gz: </strong>src files you can find on github: <a href=\"https://github.com/turpaultn/DESED\">https://github.com/turpaultn/DESED</a> . Source files to generate dcase2019 files from soundbank or generate new ones. 
(code can be outdated here, recommended to go in the github repo)</li>\n</ol>\n\n<p><strong>If you simply want the evaluation synthetic dataset used in DCASE 2019 task 4.</strong></p>\n\n<ol>\n\t<li><strong>DESED_synth_eval_dcase2019.tar.gz</strong><strong> </strong>:<strong> </strong>Evaluation audio and metadata files used in dcase 2019 task 4.</li>\n</ol>\n\n<p>&nbsp;</p>\n\n<p>The mixtures are generated using Scaper (https://github.com/justinsalamon/scaper) [1].</p>\n\n<p>* Background files are extracted from SINS [2], MUSAN [3] or Youtube and have been selected because they contain a very low amount of our sound event classes.<br>\n* Foreground files are extracted from Freesound [4][5] and manually verified to check the quality and segmented to remove silences.</p>\n\n<p><strong>References</strong><br>\n[1] J. Salamon, D. MacConnell, M. Cartwright, P. Li, and J. P. Bello. Scaper: A library for soundscape synthesis and augmentation<br>\nIn IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct. 2017.</p>\n\n<p>[2] Gert Dekkers, Steven Lauwereins, Bart Thoen, Mulu Weldegebreal Adhana, Henk Brouckxon, Toon van Waterschoot, Bart Vanrumste, Marian Verhelst, and Peter Karsmakers.<br>\nThe SINS database for detection of daily activities in a home environment using an acoustic sensor network.<br>\nIn Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), 32&ndash;36. November 2017.</p>\n\n<p>[3] David Snyder and Guoguo Chen and Daniel Povey.<br>\nMUSAN: A Music, Speech, and Noise Corpus.<br>\narXiv, 1510.08484, 2015.</p>\n\n<p>[4] F. Font, G. Roma &amp; X. Serra. Freesound technical demo. In Proceedings of the 21st ACM international conference on Multimedia. ACM, 2013.<br>\n&nbsp;<br>\n[5] E. Fonseca, J. Pons, X. Favory, F. Font, D. Bogdanov, A. Ferraro, S. Oramas, A. Porter &amp; X. Serra. 
Freesound Datasets: A Platform for the Creation of Open Audio Datasets.<br>\nIn Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.</p>\n\n<p>&nbsp;</p>", 
    "contributors": [
      {
        "affiliation": "Adobe Research, San Francisco CA, United States", 
        "type": "Researcher", 
        "name": "Salamon, Justin"
      }, 
      {
        "affiliation": "Language Technologies Institute, Carnegie Mellon University, Pittsburgh PA, United States", 
        "type": "Researcher", 
        "name": "Shah, Ankit"
      }, 
      {
        "affiliation": "Google, Inc", 
        "type": "Researcher", 
        "name": "Wisdom, Scott"
      }, 
      {
        "affiliation": "Google, Inc", 
        "type": "Researcher", 
        "name": "Hershey, John"
      }, 
      {
        "affiliation": "Google, Inc", 
        "type": "Researcher", 
        "name": "Erdogan, Hakan"
      }
    ], 
    "title": "DESED_synthetic", 
    "license": {
      "id": "CC-BY-4.0"
    }, 
    "relations": {
      "version": [
        {
          "count": 20, 
          "index": 10, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "3550598"
          }, 
          "is_last": false, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "4569096"
          }
        }
      ]
    }, 
    "communities": [
      {
        "id": "dcase"
      }
    ], 
    "version": "v2.2", 
    "keywords": [
      "DCASE", 
      "Sound event detection"
    ], 
    "publication_date": "2020-03-07", 
    "creators": [
      {
        "affiliation": "Universit\u00e9 de Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France", 
        "name": "Turpault, Nicolas"
      }, 
      {
        "affiliation": "Universit\u00e9 de Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France", 
        "name": "Serizel, Romain"
      }
    ], 
    "access_right": "open", 
    "resource_type": {
      "type": "video", 
      "title": "Video/Audio"
    }, 
    "related_identifiers": [
      {
        "scheme": "url", 
        "identifier": "https://hal.inria.fr/hal-02160855v2", 
        "relation": "isSupplementTo", 
        "resource_type": "publication-conferencepaper"
      }, 
      {
        "scheme": "url", 
        "identifier": "https://hal.inria.fr/hal-02355573", 
        "relation": "isSupplementTo", 
        "resource_type": "publication-conferencepaper"
      }, 
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.3550598", 
        "relation": "isVersionOf"
      }
    ]
  }
}
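
Each entry in the "files" array of the record pairs a download link with a "checksum" field of the form `md5:<hex digest>`. The following is a minimal sketch of how a downloaded archive could be checked against that field, assuming the archive is already saved locally. The helper names (`parse_checksum`, `file_digest`, `verify_download`) are hypothetical and are not part of Zenodo or the DESED tooling.

```python
import hashlib


def parse_checksum(field):
    """Split an 'algo:hexdigest' string (as in the record's "checksum") into its parts."""
    algo, _, digest = field.partition(":")
    return algo, digest


def file_digest(path, algo="md5", chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte archives are never fully loaded into memory."""
    hasher = hashlib.new(algo)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_download(path, checksum_field):
    """Return True if the file at `path` matches the record's checksum field."""
    algo, expected = parse_checksum(checksum_field)
    return file_digest(path, algo) == expected.lower()


# Example call (the local path is an assumption; the checksum is the one listed
# for DESED_synth_source.tar.gz in the record):
# verify_download("DESED_synth_source.tar.gz",
#                 "md5:b1d91688a56740c9ec63afc3fc769c46")
```

Reading in chunks matters here because the largest archive in the record is about 7.7 GB.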

                   All versions   This version
Views              2,377          392
Downloads          9,266          3,039
Data volume        31.4 TB        7.3 TB
Unique views       1,773          329
Unique downloads   4,202          1,191
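
The data volume figures appear to be the "version_volume" and "volume" byte counts from the record's "stats" object, expressed in decimal terabytes (10^12 bytes). A small sketch of that conversion, with a hypothetical helper name:

```python
def to_decimal_tb(num_bytes):
    """Format a byte count as decimal terabytes (1 TB = 10**12 bytes), one decimal place."""
    return f"{num_bytes / 1e12:.1f} TB"


# Byte counts copied from the "stats" object of the record.
print(to_decimal_tb(31355182297603))  # all versions -> 31.4 TB
print(to_decimal_tb(7252763731052))   # this version -> 7.3 TB
```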
