There is a newer version of this record available.

Video/Audio Open Access

DESED_synthetic

Turpault, Nicolas; Serizel, Romain


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000ngm##2200000uu#4500</leader>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">DCASE</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Sound event detection</subfield>
  </datafield>
  <controlfield tag="005">20210228213733.0</controlfield>
  <controlfield tag="001">3713328</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Université de Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France</subfield>
    <subfield code="a">Serizel, Romain</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Adobe Research, San Francisco CA, United States</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Salamon, Justin</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Language Technologies Institute, Carnegie Mellon University, Pittsburgh PA, United States</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Shah, Ankit</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Google, Inc</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Wisdom, Scott</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Google, Inc</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Hershey, John</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Google, Inc</subfield>
    <subfield code="4">res</subfield>
    <subfield code="a">Erdogan, Hakan</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2888312</subfield>
    <subfield code="z">md5:16053563caa8e761eec043f44255d96f</subfield>
    <subfield code="u">https://zenodo.org/record/3713328/files/DESED_synth_dcase2019jams.tar.gz</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1113266</subfield>
    <subfield code="z">md5:efe1d117cf3ff7a96341ca565f5ccf4c</subfield>
    <subfield code="u">https://zenodo.org/record/3713328/files/DESED_synth_dcase20_train_jams.tar.gz</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">7710291574</subfield>
    <subfield code="z">md5:e1aad0a714bb98d2b58f3d62122077b8</subfield>
    <subfield code="u">https://zenodo.org/record/3713328/files/DESED_synth_eval_dcase2019.tar.gz</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2423131154</subfield>
    <subfield code="z">md5:f061ed2322be7d338ab5f9628750ce6a</subfield>
    <subfield code="u">https://zenodo.org/record/3713328/files/DESED_synth_soundbank.tar.gz</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">27706</subfield>
    <subfield code="z">md5:b1d91688a56740c9ec63afc3fc769c46</subfield>
    <subfield code="u">https://zenodo.org/record/3713328/files/DESED_synth_source.tar.gz</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-03-07</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">user-dcase</subfield>
    <subfield code="o">oai:zenodo.org:3713328</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Université de Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France</subfield>
    <subfield code="a">Turpault, Nicolas</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">DESED_synthetic</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-dcase</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Link to the associated github repository: &lt;a href="https://github.com/turpaultn/Desed"&gt;https://github.com/turpaultn/Desed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Link to the papers: &lt;a href="https://hal.inria.fr/hal-02160855"&gt;&lt;em&gt;https://hal.inria.fr/hal-02160855&lt;/em&gt;&lt;/a&gt;,&amp;nbsp; &lt;a href="https://hal.inria.fr/hal-02355573v1"&gt;https://hal.inria.fr/hal-02355573v1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Domestic Environment Sound Event Detection (DESED).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;br&gt;
This dataset is the synthetic part of the DESED dataset. It lets you create mixtures of isolated foreground sounds and backgrounds.&lt;/p&gt;

&lt;p&gt;It provides the material to:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Reproduce the DCASE 2019 task 4 synthetic dataset&lt;/li&gt;
	&lt;li&gt;Reproduce the DCASE 2020 task 4 synthetic train dataset&lt;/li&gt;
	&lt;li&gt;Create new mixtures from isolated foreground sounds and background sounds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Files:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you want to generate new audio mixtures yourself from the original files.&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_soundbank.tar.gz&lt;/strong&gt;: Raw data used to generate mixtures.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_dcase2019jams.tar.gz&lt;/strong&gt;: JAMS files, metadata describing how to recreate the DCASE 2019 synthetic dataset.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_dcase20_train_jams.tar.gz&lt;/strong&gt;: JAMS files, metadata describing how to recreate the DCASE 2020 synthetic train dataset.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_source.tar.gz&lt;/strong&gt;: Source files to generate the DCASE 2019 files from the soundbank or to generate new mixtures. The same code is on GitHub: &lt;a href="https://github.com/turpaultn/DESED"&gt;https://github.com/turpaultn/DESED&lt;/a&gt;. The copy archived here may be outdated, so the GitHub repository is recommended.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;If you simply want the evaluation synthetic dataset used in DCASE 2019 task 4.&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_eval_dcase2019.tar.gz&lt;/strong&gt;: Evaluation audio and metadata files used in DCASE 2019 task 4.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The mixtures are generated using Scaper (https://github.com/justinsalamon/scaper) [1].&lt;/p&gt;

&lt;p&gt;* Background files are extracted from SINS [2], MUSAN [3] or YouTube and were selected because they contain very few occurrences of the DESED sound event classes.&lt;br&gt;
* Foreground files are extracted from Freesound [4][5], manually checked for quality, and segmented to remove silences.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;References&lt;/strong&gt;&lt;br&gt;
[1] J. Salamon, D. MacConnell, M. Cartwright, P. Li, and J. P. Bello. Scaper: A library for soundscape synthesis and augmentation.&lt;br&gt;
In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct. 2017.&lt;/p&gt;

&lt;p&gt;[2] Gert Dekkers, Steven Lauwereins, Bart Thoen, Mulu Weldegebreal Adhana, Henk Brouckxon, Toon van Waterschoot, Bart Vanrumste, Marian Verhelst, and Peter Karsmakers.&lt;br&gt;
The SINS database for detection of daily activities in a home environment using an acoustic sensor network.&lt;br&gt;
In Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), 32&amp;ndash;36. November 2017.&lt;/p&gt;

&lt;p&gt;[3] David Snyder and Guoguo Chen and Daniel Povey.&lt;br&gt;
MUSAN: A Music, Speech, and Noise Corpus.&lt;br&gt;
arXiv, 1510.08484, 2015.&lt;/p&gt;

&lt;p&gt;[4] F. Font, G. Roma &amp;amp; X. Serra. Freesound technical demo.&lt;br&gt;
In Proceedings of the 21st ACM International Conference on Multimedia. ACM, 2013.&lt;/p&gt;

&lt;p&gt;[5] E. Fonseca, J. Pons, X. Favory, F. Font, D. Bogdanov, A. Ferraro, S. Oramas, A. Porter &amp;amp; X. Serra. Freesound Datasets: A Platform for the Creation of Open Audio Datasets.&lt;br&gt;
In Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.&lt;/p&gt;

</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">url</subfield>
    <subfield code="i">isSupplementTo</subfield>
    <subfield code="a">https://hal.inria.fr/hal-02160855v2</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">url</subfield>
    <subfield code="i">isSupplementTo</subfield>
    <subfield code="a">https://hal.inria.fr/hal-02355573</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.3550598</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.3713328</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">video</subfield>
  </datafield>
</record>
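
Each 856 field in the record above describes one archive: subfield "s" is the size in bytes, subfield "z" the MD5 checksum, and subfield "u" the download URL. The following is a minimal sketch, assuming Python and only its standard library, of how those fields could be read and how a downloaded archive could be checked against its checksum. The names list_files, md5_of and SAMPLE_RECORD are illustrative and are not part of DESED or Zenodo tooling; SAMPLE_RECORD is a trimmed copy of the record that keeps a single 856 field.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace declared on the <record> element of the MARC21 export.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Trimmed copy of the record above (one 856 field only), for illustration.
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">27706</subfield>
    <subfield code="z">md5:b1d91688a56740c9ec63afc3fc769c46</subfield>
    <subfield code="u">https://zenodo.org/record/3713328/files/DESED_synth_source.tar.gz</subfield>
  </datafield>
</record>
"""


def list_files(marc_xml):
    """Return one dict per 856 field with its url, size in bytes and MD5 hex digest."""
    root = ET.fromstring(marc_xml)
    files = []
    for field in root.findall(".//marc:datafield[@tag='856']", MARC_NS):
        # Map subfield code -> text, e.g. {"s": "27706", "z": "md5:...", "u": "https://..."}.
        codes = {
            sub.get("code"): (sub.text or "")
            for sub in field.findall("marc:subfield", MARC_NS)
        }
        files.append({
            "url": codes.get("u", ""),
            "size": int(codes["s"]) if codes.get("s") else None,
            # Drop the "md5:" prefix used in the export.
            "md5": codes.get("z", "").split(":", 1)[-1],
        })
    return files


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    for entry in list_files(SAMPLE_RECORD):
        print(entry["url"], entry["size"], entry["md5"])
```

After downloading an archive, comparing md5_of(local_path) with the "md5" value of the matching entry shows whether the transfer completed intact. Reading in chunks matters here because the largest archive in the record is about 7.7 GB.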