Video/Audio Open Access

DESED_synthetic

Turpault, Nicolas; Serizel, Romain


JSON-LD (schema.org) Export

{
  "description": "<p>Link to the associated github repository: <a href=\"https://github.com/turpaultn/Desed\">https://github.com/turpaultn/Desed</a></p>\n\n<p>Link to the papers: <a href=\"https://hal.inria.fr/hal-02160855\"><em>https://hal.inria.fr/hal-02160855</em></a>,&nbsp; <a href=\"https://hal.inria.fr/hal-02355573v1\">https://hal.inria.fr/hal-02355573v1</a></p>\n\n<p>Domestic Environment Sound Event Detection (DESED).</p>\n\n<p><strong>Description</strong><br>\nThis dataset is the synthetic part of the DESED dataset. It allows creating mixtures of isolated sounds and backgrounds.</p>\n\n<p>There is the material to:</p>\n\n<ul>\n\t<li>Reproduce the DCASE 2019 task 4 synthetic dataset</li>\n\t<li>Reproduce the DCASE 2020 task 4 synthetic train dataset</li>\n\t<li>Creating new mixtures from isolated foreground sounds and background sounds.</li>\n</ul>\n\n<p><strong>Files:</strong></p>\n\n<p><strong>If you want to generate new audio mixtures yourself from the original files.</strong></p>\n\n<ol>\n\t<li><strong>DESED_synth_soundbank.tar.gz</strong> : Raw data used to generate mixtures.</li>\n\t<li><strong>DESED_synth_dcase2019jams.tar.gz</strong>: JAMS files, metadata describing how to recreate the&nbsp; dcase2019 synthetic dataset<strong> </strong></li>\n\t<li><strong>DESED_synth_dcase20_train_jams.tar: </strong>JAMS files, metadata describing how to recreate the dcase2020 synthetic train dataset</li>\n\t<li><strong>DESED_synth_source.tar.gz: </strong>src files you can find on github: <a href=\"https://github.com/turpaultn/DESED\">https://github.com/turpaultn/DESED</a> . Source files to generate dcase2019 files from soundbank or generate new ones. 
(code can be outdated here, recommended to go in the github repo)</li>\n</ol>\n\n<p><strong>If you simply want the evaluation synthetic dataset used in DCASE 2019 task 4.</strong></p>\n\n<ol>\n\t<li><strong>DESED_synth_eval_dcase2019.tar.gz</strong><strong> </strong>:<strong> </strong>Evaluation audio and metadata files used in dcase 2019 task 4.</li>\n</ol>\n\n<p>&nbsp;</p>\n\n<p>The mixtures are generated using Scaper (https://github.com/justinsalamon/scaper) [1].</p>\n\n<p>* Background files are extracted from SINS [2], MUSAN [3] or Youtube and have been selected because they contain a very low amount of our sound event classes.<br>\n* Foreground files are extracted from Freesound [4][5] and manually verified to check the quality and segmented to remove silences.</p>\n\n<p><strong>References</strong><br>\n[1] J. Salamon, D. MacConnell, M. Cartwright, P. Li, and J. P. Bello. Scaper: A library for soundscape synthesis and augmentation<br>\nIn IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct. 2017.</p>\n\n<p>[2] Gert Dekkers, Steven Lauwereins, Bart Thoen, Mulu Weldegebreal Adhana, Henk Brouckxon, Toon van Waterschoot, Bart Vanrumste, Marian Verhelst, and Peter Karsmakers.<br>\nThe SINS database for detection of daily activities in a home environment using an acoustic sensor network.<br>\nIn Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), 32&ndash;36. November 2017.</p>\n\n<p>[3] David Snyder and Guoguo Chen and Daniel Povey.<br>\nMUSAN: A Music, Speech, and Noise Corpus.<br>\narXiv, 1510.08484, 2015.</p>\n\n<p>[4] F. Font, G. Roma &amp; X. Serra. Freesound technical demo. In Proceedings of the 21st ACM international conference on Multimedia. ACM, 2013.<br>\n&nbsp;<br>\n[5] E. Fonseca, J. Pons, X. Favory, F. Font, D. Bogdanov, A. Ferraro, S. Oramas, A. Porter &amp; X. Serra. 
Freesound Datasets: A Platform for the Creation of Open Audio Datasets.<br>\nIn Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.</p>\n\n<p>&nbsp;</p>", 
  "license": "https://creativecommons.org/licenses/by/4.0/legalcode", 
  "creator": [
    {
      "affiliation": "Universit\u00e9 de Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France", 
      "@type": "Person", 
      "name": "Turpault, Nicolas"
    }, 
    {
      "affiliation": "Universit\u00e9 de Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France", 
      "@type": "Person", 
      "name": "Serizel, Romain"
    }
  ], 
  "url": "https://zenodo.org/record/3713328", 
  "datePublished": "2020-03-07", 
  "keywords": [
    "DCASE", 
    "Sound event detection"
  ], 
  "version": "v2.2", 
  "contributor": [
    {
      "affiliation": "Adobe Research, San Francisco CA, United States", 
      "@type": "Person", 
      "name": "Salamon, Justin"
    }, 
    {
      "affiliation": "Language Technologies Institute, Carnegie Mellon University, Pittsburgh PA, United States", 
      "@type": "Person", 
      "name": "Shah, Ankit"
    }, 
    {
      "affiliation": "Google, Inc", 
      "@type": "Person", 
      "name": "Wisdom, Scott"
    }, 
    {
      "affiliation": "Google, Inc", 
      "@type": "Person", 
      "name": "Hershey, John"
    }, 
    {
      "affiliation": "Google, Inc", 
      "@type": "Person", 
      "name": "Erdogan, Hakan"
    }
  ], 
  "@context": "https://schema.org/", 
  "identifier": "https://doi.org/10.5281/zenodo.3713328", 
  "@id": "https://doi.org/10.5281/zenodo.3713328", 
  "@type": "MediaObject", 
  "name": "DESED_synthetic"
}
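The JAMS archives listed in the description are plain JSON files, so their event metadata can be inspected with the standard library alone. Below is a minimal sketch that lists the annotated sound events in one such file; the key layout (`annotations`, the `scaper` namespace, `time`/`duration`/`value` observations) follows the general JAMS/Scaper conventions, and the demo structure is a hypothetical stand-in, not a real DESED file, so real files may carry additional fields. Regenerating the actual audio is done with Scaper itself, shown in the trailing comment.

```python
import json
import os
import tempfile

def list_events(jams_path):
    """Return (label, onset, duration) tuples for each sound event
    annotated in a Scaper-generated JAMS file (JAMS files are JSON)."""
    with open(jams_path) as f:
        jam = json.load(f)
    events = []
    for ann in jam.get("annotations", []):
        # Scaper stores soundscape events under the "scaper" namespace.
        if ann.get("namespace") != "scaper":
            continue
        for obs in ann.get("data", []):
            value = obs["value"]
            events.append((value["label"], obs["time"], obs["duration"]))
    return events

# Demo on a minimal JAMS-like file written to a temporary location;
# the layout mimics Scaper/JAMS conventions, not a real DESED file.
demo = {
    "annotations": [
        {
            "namespace": "scaper",
            "data": [
                {"time": 0.0, "duration": 10.0, "confidence": 1.0,
                 "value": {"label": "Speech", "role": "foreground"}},
            ],
        }
    ]
}
with tempfile.NamedTemporaryFile("w", suffix=".jams", delete=False) as f:
    json.dump(demo, f)
    tmp_path = f.name
events = list_events(tmp_path)
os.remove(tmp_path)
print(events)  # [('Speech', 0.0, 10.0)]

# To actually regenerate audio, Scaper provides generate_from_jams
# (the soundbank archive must be extracted first), roughly:
#   import scaper
#   scaper.generate_from_jams("soundscape.jams", "soundscape.wav",
#                             fg_path="soundbank/foreground",
#                             bg_path="soundbank/background")
```

This is why distributing JAMS files instead of audio is practical: the full synthetic dataset can be rebuilt deterministically from the (much smaller) soundbank plus metadata.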
Statistics          All versions   This version
Views                      2,377            392
Downloads                  9,266          3,039
Data volume              31.4 TB         7.3 TB
Unique views               1,773            329
Unique downloads           4,202          1,191
