Dataset Open Access

TAU Spatial Sound Events 2019 - Ambisonic and Microphone Array, Development Datasets

Sharath Adavanne; Archontis Politis; Tuomas Virtanen


Citation Style Language JSON Export

{
  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.2599196", 
  "title": "TAU Spatial Sound Events 2019 - Ambisonic and Microphone Array, Development Datasets", 
  "issued": {
    "date-parts": [
      [
        2019, 
        2, 
        28
      ]
    ]
  }, 
  "abstract": "<p>This package consists of two development datasets, <strong>TAU Spatial Sound Events 2019 - Ambisonic</strong>&nbsp;and <strong>TAU Spatial Sound Events 2019 - Microphone Array</strong>. These datasets contain recordings from an identical scene, with <strong>TAU Spatial Sound Events 2019 - Ambisonic</strong>&nbsp;providing four-channel First-Order Ambisonic (FOA) recordings while <strong>TAU Spatial Sound Events 2019 - Microphone Array</strong>&nbsp;provides four-channel directional microphone recordings from a tetrahedral array configuration. Both formats are extracted from the same microphone array. The recordings in the two datasets consist of stationary point sources from multiple sound classes, each associated with temporal onset and offset times and a direction-of-arrival (DOA) coordinate represented by azimuth and elevation angles. These development datasets are part of the <a href=\"https://github.com/sharathadavanne/seld-dcase2019\">DCASE 2019 Sound Event Localization and Detection Task</a>.</p>\n\n<p>Each development dataset consists of 400 one-minute recordings sampled at 48000 Hz, divided into four cross-validation splits of 100 recordings each. These recordings were synthesized using spatial room impulse responses (IRs) collected from five indoor locations, at 504 unique combinations of azimuth-elevation-distance. To synthesize the recordings, the collected IRs were convolved with the <a href=\"http://www.cs.tut.fi/sgn/arg/dcase2016/task-sound-event-detection-in-synthetic-audio#audio-dataset\">isolated sound events dataset from DCASE 2016 task 2</a>. Finally, to create a realistic sound scene recording, natural ambient noise collected in the IR recording locations was added to the synthesized recordings such that the average SNR of the sound events was 30 dB.</p>\n\n<p>The IRs were collected in Finland by Tampere University between 12/2017 and 06/2018. The data collection received funding from the European Research Council, grant agreement 637422 EVERYSOUND.</p>\n\n<p><strong>Download instructions</strong></p>\n\n<p>The three files, <strong><em>foa_dev.z01</em></strong>, <strong><em>foa_dev.z02</em></strong>&nbsp;and <strong><em>foa_dev.zip</em></strong>, correspond to the audio data of the <strong>TAU Spatial Sound Events 2019 - Ambisonic</strong>&nbsp;development dataset.<br>\nThe two files, <strong><em>mic_dev.z01</em></strong>&nbsp;and <strong><em>mic_dev.zip</em></strong>, correspond to the audio data of the <strong>TAU Spatial Sound Events 2019 - Microphone Array</strong>&nbsp;development dataset.<br>\nThe <strong><em>metadata_dev.zip</em></strong>&nbsp;file contains the common metadata for both the <strong>TAU Spatial Sound Events 2019 - Ambisonic</strong>&nbsp;and <strong>TAU Spatial Sound Events 2019 - Microphone Array</strong>&nbsp;development datasets.</p>\n\n<p>Download the zip files corresponding to the dataset of interest and use your favorite compression tool to unzip these split zip files.</p>", 
  "author": [
    {
      "family": "Adavanne", 
      "given": "Sharath"
    }, 
    {
      "family": "Politis", 
      "given": "Archontis"
    }, 
    {
      "family": "Virtanen", 
      "given": "Tuomas"
    }
  ], 
  "type": "dataset", 
  "id": "2599196"
}
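
A record like the one above can be read with any JSON parser. The following is a minimal Python sketch, not part of the Zenodo export, showing one way to pull out a resolvable DOI link, an ISO-style date, the author names, and a plain-text abstract. The `RECORD` string is a trimmed copy of the export with a shortened abstract, and the function name `summarize` and its output keys are hypothetical, chosen only for illustration.

```python
import json
import re
from html import unescape

# Trimmed copy of the record above; the abstract is shortened for illustration.
RECORD = """
{
  "publisher": "Zenodo",
  "DOI": "10.5281/zenodo.2599196",
  "title": "TAU Spatial Sound Events 2019 - Ambisonic and Microphone Array, Development Datasets",
  "issued": {"date-parts": [[2019, 2, 28]]},
  "abstract": "<p>This package consists of two development datasets, <strong>TAU Spatial Sound Events 2019 - Ambisonic</strong>&nbsp;and <strong>TAU Spatial Sound Events 2019 - Microphone Array</strong>.</p>",
  "author": [
    {"family": "Sharath Adavanne"},
    {"family": "Archontis Politis"},
    {"family": "Tuomas Virtanen"}
  ],
  "type": "dataset",
  "id": "2599196"
}
"""


def summarize(record_text):
    """Return display-ready fields from a CSL JSON record (hypothetical helper)."""
    record = json.loads(record_text)

    # "date-parts" is a list of [year, month, day] lists; take the first date.
    year, month, day = record["issued"]["date-parts"][0]

    # Join "given" and "family" when both are present; either may be missing.
    authors = [
        " ".join(part for part in (a.get("given"), a.get("family")) if part)
        for a in record["author"]
    ]

    # Drop HTML tags, decode entities such as &nbsp;, then collapse whitespace.
    text = unescape(re.sub(r"<[^>]+>", "", record["abstract"]))
    text = " ".join(text.split())

    return {
        "doi_url": "https://doi.org/" + record["DOI"],
        "issued": f"{year:04d}-{month:02d}-{day:02d}",
        "authors": authors,
        "abstract": text,
    }


info = summarize(RECORD)
for key, value in info.items():
    print(f"{key}: {value}")
```

The regular expression is adequate for the simple inline markup in this abstract; a full HTML parser would be a safer choice for arbitrary input.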
                    All versions    This version
Views                      2,475           1,762
Downloads                 14,548           6,909
Data volume              23.7 TB         10.4 TB
Unique views               2,082           1,472
Unique downloads           2,151           1,717
