
Video/Audio · Open Access

DESED_synthetic

Turpault, Nicolas; Serizel, Romain


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.3713328</identifier>
  <creators>
    <creator>
      <creatorName>Turpault, Nicolas</creatorName>
      <givenName>Nicolas</givenName>
      <familyName>Turpault</familyName>
      <affiliation>Université de Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France</affiliation>
    </creator>
    <creator>
      <creatorName>Serizel, Romain</creatorName>
      <givenName>Romain</givenName>
      <familyName>Serizel</familyName>
      <affiliation>Université de Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France</affiliation>
    </creator>
  </creators>
  <titles>
    <title>DESED_synthetic</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2020</publicationYear>
  <subjects>
    <subject>DCASE</subject>
    <subject>Sound event detection</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2020-03-07</date>
  </dates>
  <resourceType resourceTypeGeneral="Audiovisual"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/3713328</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsSupplementTo" resourceTypeGeneral="ConferencePaper">https://hal.inria.fr/hal-02160855v2</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsSupplementTo" resourceTypeGeneral="ConferencePaper">https://hal.inria.fr/hal-02355573</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.3550598</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf">https://zenodo.org/communities/dcase</relatedIdentifier>
  </relatedIdentifiers>
  <version>v2.2</version>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;Link to the associated github repository: &lt;a href="https://github.com/turpaultn/Desed"&gt;https://github.com/turpaultn/Desed&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Link to the papers: &lt;a href="https://hal.inria.fr/hal-02160855"&gt;&lt;em&gt;https://hal.inria.fr/hal-02160855&lt;/em&gt;&lt;/a&gt;, &lt;a href="https://hal.inria.fr/hal-02355573v1"&gt;https://hal.inria.fr/hal-02355573v1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Domestic Environment Sound Event Detection (DESED).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;br&gt;
This dataset is the synthetic part of the DESED dataset. It provides the material to create mixtures of isolated sound events and background sounds.&lt;/p&gt;

&lt;p&gt;It contains the material to:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Reproduce the DCASE 2019 task 4 synthetic dataset&lt;/li&gt;
	&lt;li&gt;Reproduce the DCASE 2020 task 4 synthetic train dataset&lt;/li&gt;
	&lt;li&gt;Create new mixtures from isolated foreground and background sounds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Files:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you want to generate new audio mixtures yourself from the original files.&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_soundbank.tar.gz&lt;/strong&gt; : Raw data used to generate mixtures.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_dcase2019jams.tar.gz&lt;/strong&gt;: JAMS files, metadata describing how to recreate the DCASE 2019 synthetic dataset&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_dcase20_train_jams.tar&lt;/strong&gt;: JAMS files, metadata describing how to recreate the DCASE 2020 synthetic train dataset&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_source.tar.gz&lt;/strong&gt;: Source files, also available on GitHub: &lt;a href="https://github.com/turpaultn/DESED"&gt;https://github.com/turpaultn/DESED&lt;/a&gt;. Scripts to regenerate the DCASE 2019 files from the soundbank or to generate new mixtures (the code in this archive may be outdated; prefer the GitHub repository). A sketch of the regeneration step follows this list.&lt;/li&gt;
&lt;/ol&gt;
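
&lt;p&gt;For illustration, recreating the mixtures from the JAMS files can be sketched with Scaper's &lt;em&gt;generate_from_jams&lt;/em&gt; function. This is a minimal sketch, not the official generation script (see the GitHub repository for that); the directory names below are assumptions about where the archives were extracted.&lt;/p&gt;

&lt;pre&gt;
import glob
import os

import scaper

# Assumed local layout after extracting the archives; adjust to yours.
FG_PATH = "DESED_synth_soundbank/foreground"
BG_PATH = "DESED_synth_soundbank/background"
JAMS_DIR = "DESED_synth_dcase2019jams"
OUT_DIR = "generated_dcase2019"

os.makedirs(OUT_DIR, exist_ok=True)
for jams_file in sorted(glob.glob(os.path.join(JAMS_DIR, "*.jams"))):
    name = os.path.splitext(os.path.basename(jams_file))[0]
    # Re-synthesize the mixture described by this JAMS annotation,
    # pointing Scaper at the local soundbank copies.
    scaper.generate_from_jams(
        jams_file,
        os.path.join(OUT_DIR, name + ".wav"),
        fg_path=FG_PATH,
        bg_path=BG_PATH,
    )
&lt;/pre&gt;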

&lt;p&gt;&lt;strong&gt;If you simply want the evaluation synthetic dataset used in DCASE 2019 task 4.&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_eval_dcase2019.tar.gz&lt;/strong&gt;: Evaluation audio and metadata files used in DCASE 2019 task 4.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The mixtures are generated using &lt;a href="https://github.com/justinsalamon/scaper"&gt;Scaper&lt;/a&gt; [1].&lt;/p&gt;
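
&lt;p&gt;A minimal sketch of generating one new 10-second mixture with Scaper. The labels, paths, and distribution parameters below are illustrative placeholders, not the exact DESED generation settings (those are in the source archive and GitHub repository).&lt;/p&gt;

&lt;pre&gt;
import scaper

# Assumed soundbank layout; adjust to where the archives were extracted.
FG_PATH = "DESED_synth_soundbank/foreground"
BG_PATH = "DESED_synth_soundbank/background"

sc = scaper.Scaper(duration=10.0, fg_path=FG_PATH, bg_path=BG_PATH)
sc.ref_db = -50  # reference loudness of the background

# One background track, chosen at random from the background soundbank.
sc.add_background(label=("choose", []),
                  source_file=("choose", []),
                  source_time=("const", 0))

# One foreground event with randomized onset, duration and SNR.
sc.add_event(label=("choose", []),
             source_file=("choose", []),
             source_time=("const", 0),
             event_time=("uniform", 0, 9),
             event_duration=("truncnorm", 3.0, 1.0, 0.5, 5.0),
             snr=("uniform", 6, 30),
             pitch_shift=None,
             time_stretch=None)

# Write the audio mixture and the JAMS annotation that describes it.
sc.generate("mixture.wav", "mixture.jams")
&lt;/pre&gt;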

&lt;ul&gt;
	&lt;li&gt;Background files are extracted from SINS [2], MUSAN [3] or YouTube and were selected because they contain very few occurrences of the target sound event classes.&lt;/li&gt;
	&lt;li&gt;Foreground files are extracted from Freesound [4][5], manually verified for quality, and segmented to remove silences.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;References&lt;/strong&gt;&lt;br&gt;
[1] J. Salamon, D. MacConnell, M. Cartwright, P. Li, and J. P. Bello. Scaper: A library for soundscape synthesis and augmentation.&lt;br&gt;
In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct. 2017.&lt;/p&gt;

&lt;p&gt;[2] Gert Dekkers, Steven Lauwereins, Bart Thoen, Mulu Weldegebreal Adhana, Henk Brouckxon, Toon van Waterschoot, Bart Vanrumste, Marian Verhelst, and Peter Karsmakers.&lt;br&gt;
The SINS database for detection of daily activities in a home environment using an acoustic sensor network.&lt;br&gt;
In Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), 32&amp;ndash;36, November 2017.&lt;/p&gt;

&lt;p&gt;[3] David Snyder, Guoguo Chen, and Daniel Povey.&lt;br&gt;
MUSAN: A Music, Speech, and Noise Corpus.&lt;br&gt;
arXiv preprint arXiv:1510.08484, 2015.&lt;/p&gt;

&lt;p&gt;[4] F. Font, G. Roma &amp;amp; X. Serra. Freesound technical demo. In Proceedings of the 21st ACM International Conference on Multimedia. ACM, 2013.&lt;/p&gt;

&lt;p&gt;[5] E. Fonseca, J. Pons, X. Favory, F. Font, D. Bogdanov, A. Ferraro, S. Oramas, A. Porter &amp;amp; X. Serra. Freesound Datasets: A Platform for the Creation of Open Audio Datasets.&lt;br&gt;
In Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.&lt;/p&gt;

</description>
  </descriptions>
</resource>
Statistics          All versions    This version
Views               2,371           392
Downloads           9,263           3,039
Data volume         31.4 TB         7.3 TB
Unique views        1,767           329
Unique downloads    4,199           1,191
