Video/Audio Open Access


Turpault, Nicolas; Serizel, Romain

DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="" xmlns="" xsi:schemaLocation="">
  <identifier identifierType="DOI">10.5281/zenodo.4569096</identifier>
  <creators>
    <creator>
      <creatorName>Turpault, Nicolas</creatorName>
      <affiliation>Université de Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France</affiliation>
    </creator>
    <creator>
      <creatorName>Serizel, Romain</creatorName>
      <affiliation>Université de Lorraine, CNRS, Inria, Loria, F-54000 Nancy, France</affiliation>
    </creator>
  </creators>
  <subjects>
    <subject>Sound event detection</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2020-03-07</date>
  </dates>
  <resourceType resourceTypeGeneral="Audiovisual"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url"></alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsSupplementTo" resourceTypeGeneral="ConferencePaper"></relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsSupplementTo" resourceTypeGeneral="ConferencePaper"></relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.3550598</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf"></relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
    <description descriptionType="Abstract">&lt;p&gt;Link to the associated github repository: &lt;a href=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Link to the papers: &lt;a href=""&gt;&lt;em&gt;&lt;/em&gt;&lt;/a&gt;, &lt;a href=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Domestic Environment Sound Event Detection (DESED).&lt;/p&gt;

&lt;p&gt;This dataset is the synthetic part of the DESED dataset. It allows creating mixtures of isolated sounds and backgrounds.&lt;/p&gt;

&lt;p&gt;This record provides the material to:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Reproduce the DCASE 2019 task 4 synthetic dataset&lt;/li&gt;
	&lt;li&gt;Reproduce the DCASE 2020 task 4 synthetic train dataset&lt;/li&gt;
	&lt;li&gt;Create new mixtures from isolated foreground sounds and background sounds&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;strong&gt;If you want to generate new audio mixtures yourself from the original files:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_soundbank.tar.gz&lt;/strong&gt;: raw data (isolated foreground sounds and backgrounds) used to generate the mixtures.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_dcase2019jams.tar.gz&lt;/strong&gt;: JAMS files, metadata describing how to recreate the DCASE 2019 synthetic dataset.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_dcase20_train_val_jams.tar&lt;/strong&gt;: JAMS files, metadata describing how to recreate the DCASE 2020 synthetic train and validation datasets.&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_dcase20_eval_jams.tar&lt;/strong&gt;: JAMS files, metadata describing how to recreate the DCASE 2020 synthetic eval dataset (only the basic one; variants of it have been made but are not included here).&lt;/li&gt;
&lt;/ul&gt;
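Each JAMS file records, per clip, which soundbank files were placed where, so the exact audio can be re-synthesized and strong labels can be derived. As a rough illustration only (the dict below is hand-written and simplified; field names loosely follow Scaper's sound_event annotations, and the file paths are hypothetical, not real soundbank entries), this sketch shows how such per-event metadata maps to the (onset, offset, label) ground truth used for sound event detection:

```python
# Hand-written, simplified stand-in for one clip's JAMS annotation.
# Field names loosely mirror Scaper's "sound_event" namespace; the
# source_file paths are hypothetical examples.
clip_annotation = {
    "duration": 10.0,
    "events": [
        {"label": "Speech", "event_time": 0.5, "event_duration": 2.3,
         "source_file": "soundbank/foreground/Speech/example1.wav", "snr": 6.0},
        {"label": "Vacuum_cleaner", "event_time": 3.0, "event_duration": 5.0,
         "source_file": "soundbank/foreground/Vacuum_cleaner/example2.wav", "snr": 0.0},
    ],
}

def to_strong_labels(annotation):
    """Flatten per-event metadata into (onset, offset, label) rows,
    the strong-label ground-truth format used in DCASE task 4."""
    return [
        (e["event_time"], e["event_time"] + e["event_duration"], e["label"])
        for e in annotation["events"]
    ]

labels = to_strong_labels(clip_annotation)
# two rows: one per annotated event, offsets = onset + duration
```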

&lt;p&gt;&lt;strong&gt;If you simply want the evaluation synthetic dataset used in DCASE 2019 task 4:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;strong&gt;DESED_synth_eval_dcase2019.tar.gz&lt;/strong&gt;: evaluation audio and metadata files used in DCASE 2019 task 4.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The mixtures are generated using Scaper [1].&lt;/p&gt;
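At its core, soundscape synthesis of this kind places a foreground event onto a background at a chosen onset, scaled to a target event-to-background SNR. The sketch below illustrates that mixing step in plain Python; it is a minimal stand-in for what Scaper does internally, not Scaper's actual API (function names and the toy signals are invented for illustration):

```python
import math

def rms(x):
    """Root-mean-square level of a signal given as a list of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def mix(background, event, onset, snr_db):
    """Place `event` on a copy of `background` starting at sample `onset`,
    scaled so the event-to-background RMS ratio matches `snr_db` (in dB)."""
    # Gain that brings the event's RMS to background_rms * 10^(snr/20).
    gain = rms(background) * 10 ** (snr_db / 20.0) / rms(event)
    out = list(background)
    for i, v in enumerate(event):
        if onset + i < len(out):
            out[onset + i] += gain * v
    return out

# Toy signals: constant background, short alternating-sign "event".
background = [0.1] * 16
event = [0.5, -0.5, 0.5, -0.5]
mixture = mix(background, event, onset=4, snr_db=0.0)
# At 0 dB SNR the scaled event has the same RMS as the background.
```

In the real pipeline this scaling and placement is driven by the distributions (event label, onset, duration, SNR) sampled by Scaper, and the JAMS files pin down the sampled values so a mixture can be recreated exactly.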

&lt;ul&gt;
	&lt;li&gt;Background files are extracted from SINS [2], MUSAN [3] or YouTube, and were selected because they contain very few occurrences of the target sound event classes.&lt;/li&gt;
	&lt;li&gt;Foreground files are extracted from Freesound [4][5], manually verified for quality, and segmented to remove silences.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;[1] J. Salamon, D. MacConnell, M. Cartwright, P. Li, and J. P. Bello. Scaper: a library for soundscape synthesis and augmentation.&lt;br&gt;
In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct. 2017.&lt;/p&gt;

&lt;p&gt;[2] Gert Dekkers, Steven Lauwereins, Bart Thoen, Mulu Weldegebreal Adhana, Henk Brouckxon, Toon van Waterschoot, Bart Vanrumste, Marian Verhelst, and Peter Karsmakers.&lt;br&gt;
The SINS database for detection of daily activities in a home environment using an acoustic sensor network.&lt;br&gt;
In Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), 32&amp;ndash;36. November 2017.&lt;/p&gt;

&lt;p&gt;[3] David Snyder, Guoguo Chen, and Daniel Povey.&lt;br&gt;
MUSAN: a music, speech, and noise corpus.&lt;br&gt;
arXiv preprint arXiv:1510.08484, 2015.&lt;/p&gt;

&lt;p&gt;[4] F. Font, G. Roma &amp;amp; X. Serra. Freesound technical demo. In Proceedings of the 21st ACM International Conference on Multimedia. ACM, 2013.&lt;br&gt;
[5] E. Fonseca, J. Pons, X. Favory, F. Font, D. Bogdanov, A. Ferraro, S. Oramas, A. Porter &amp;amp; X. Serra. Freesound Datasets: a platform for the creation of open audio datasets.&lt;br&gt;
In Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.&lt;/p&gt;</description>
</resource>

                  All versions   This version
Views             2,602          386
Downloads         9,589          900
Data volume       32.6 TB        6.2 TB
Unique views      1,906          316
Unique downloads  4,386          565

