Video/Audio Open Access

Test Database for the Assessment of Immersive Audio Systems

Ogden, Harry; Stubbs, Jess; Kearney, Gavin


JSON-LD (schema.org) Export

{
  "inLanguage": {
    "alternateName": "eng", 
    "@type": "Language", 
    "name": "English"
  }, 
  "description": "<p>This repository contains a new library of listening material, for the testing of immersive audio systems, that includes synthetic sound sources, speech recordings and short musical and instrumental performances.</p>\n\n<p>Evaluation of perceived audio quality is an essential part of spatial audio system design, where listening tests help to reveal any spatial and timbral distortions that occur. Selection of audio stimuli constitutes an important part of listening test methods, as different stimuli will reveal specific properties of the perceived audio. A wide range of listening test material is therefore required, from which the most appropriate stimuli can be chosen based on the context of the test. For researchers in the field of immersive audio, availability of such materials can be sparse due to the differing requirements of surround sound and ambisonic testing. To this end a new test database has been developed, for use in the spatial and timbral evaluation of immersive audio systems.</p>\n\n<p>---</p>\n\n<p>The data is organised as follows:</p>\n\n<p><strong>Source Files</strong><br>\n- Source_2-Pop (1kHz tone, one frame long (25ms or 50ms))<br>\n- Source_3rdOctaveBandPinkNoise (10 &amp; 60 second durations, frequency bands; 32, 64, 125, 250, 500, 1k, 2k, 4k, 8k, 16kHz)<br>\n- Source_500-2000Hz_PinkNoise (Pink noise with frequencies below 500Hz removed &amp; cut-off at 2kHz)<br>\n- Source_AcousticGuitar&amp;Vocals (4 original pieces consisting of multiple guitar, vocal, drum &amp; shaker tracks)<br>\n- Source_ConversationalSpeech (selection of short conversations &amp; passages recorded in an anechoic chamber and reverberant classroom)<br>\n- Source_DTMF_Tones (Tone pairs consisting of lower &amp; higher frequencies with durations of 1s, 10s, 100ms &amp; 200ms)<br>\n- Source_GreenwichTimeSignal (series of five 0.1 second, 1 kHz tone bursts separated by 0.9 seconds of silence concluded by a 0.5 second 1 kHz tone)<br>\n- 
Source_PinkNoise (durations of 1s, 10s, 60s, 100ms &amp; 200ms)<br>\n- Source_SinePureTones (1s, 10s, 60s, 100ms &amp; 200ms durations, frequencies; 20, 32, 64, 125, 250, 440, 500, 1k, 2k, 4k, &nbsp;8k, 16k, 20kHz)<br>\n- Source_SpeechMaterial(Female) (includes sentences &amp; passages; speaker positions &amp; names; azimuth &amp; elevation angles (-180 to +180); Numbers, alphabet &amp; assorted audio terms)<br>\n- Source_SpeechMaterial(Male) (includes sentences &amp; passages; speaker positions &amp; names; azimuth &amp; elevation angles (-180 to +180); Numbers, alphabet &amp; assorted audio terms)<br>\n- Source_SpeechMaterial(Mandarin) (includes only sentences &amp; passages)<br>\n- Source_WhiteNoise (durations of 1s, 10s, 60s, 100ms &amp; 200ms)</p>\n\n<p><strong>Ambisonically Encoded Files</strong><br>\n- XOrder_3rdOctPinkNoise (X, <strong>where X = 1st, 3rd, 5th, 7th</strong>, Order encoded 1/3 Octave Band Pink noise files)&nbsp;<br>\n&nbsp; &nbsp; - Cube (sources encoded to cube face positions)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_32Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 32Hz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_64Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 64Hz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_125Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 125Hz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_250Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 250Hz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_500Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 500Hz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_1000Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 1kHz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_2000Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 2kHz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_4000Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 4kHz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 
3rdOctPinkNoise_8000Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 8kHz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_16000Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 16kHz center frequency)<br>\n&nbsp; &nbsp; - Dodecahedron (sources encoded to dodecahedron face positions)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_32Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 32Hz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_64Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 64Hz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_125Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 125Hz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_250Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 250Hz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_500Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 500Hz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_1000Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 1kHz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_2000Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 2kHz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_4000Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 4kHz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_8000Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 8kHz center frequency)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 3rdOctPinkNoise_16000Hz_-20dBFS_10s_48kHz_24Bit (10 seconds, 16kHz center frequency)<br>\n- XOrder_500-2000Hz_PinkNoise (X Order encoded 500-2000Hz Pink noise files)<br>\n&nbsp; &nbsp; - Cube (sources encoded to cube face positions)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 500-2000Hz_PinkNoise_-20dBFS_1s_48kHz_24Bit (1 second)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 500-2000Hz_PinkNoise_-20dBFS_10s_48kHz_24Bit (10 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 500-2000Hz_PinkNoise_-20dBFS_60s_48kHz_24Bit (60 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 500-2000Hz_PinkNoise_-20dBFS_100ms_48kHz_24Bit (100 
milliseconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 500-2000Hz_PinkNoise_-20dBFS_200ms_48kHz_24Bit (200 milliseconds)<br>\n&nbsp; &nbsp; - Dodecahedron (sources encoded to dodecahedron face positions)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 500-2000Hz_PinkNoise_-20dBFS_1s_48kHz_24Bit (1 second)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 500-2000Hz_PinkNoise_-20dBFS_10s_48kHz_24Bit (10 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 500-2000Hz_PinkNoise_-20dBFS_60s_48kHz_24Bit (60 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 500-2000Hz_PinkNoise_-20dBFS_100ms_48kHz_24Bit (100 milliseconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - 500-2000Hz_PinkNoise_-20dBFS_200ms_48kHz_24Bit (200 milliseconds)<br>\n- XOrder_BroadcastSources (X Order encoded 2-pop, GTS &amp; DTMF tone files)<br>\n&nbsp; &nbsp; - 2-Pop_-20dBFS_25ms_48kHz_24Bit (encoded to cube face positions)<br>\n&nbsp; &nbsp; - 2-Pop_-20dBFS_50ms_48kHz_24Bit (encoded to cube face positions)<br>\n&nbsp; &nbsp; - DTMF_Tones_-20dBFS_1s_48kHz_24Bit (1 second, encoded to front center position)<br>\n&nbsp; &nbsp; - DTMF_Tones_-20dBFS_10s_48kHz_24Bit (10 seconds, encoded to front center position)<br>\n&nbsp; &nbsp; - DTMF_Tones_-20dBFS_100ms_48kHz_24Bit (100 milliseconds, encoded to front center position)<br>\n&nbsp; &nbsp; - DTMF_Tones_-20dBFS_200ms_48kHz_24Bit (200 milliseconds, encoded to front center position)<br>\n&nbsp; &nbsp; - GTS_Full_-20dBFS_48kHz_24Bit (encoded to cube face positions)<br>\n- XOrder_ExampleTestFiles (X Order encoded Pink noise announced example test files e.g. 
&quot;Front Center&quot; *Noise burst at front center*)<br>\n&nbsp; &nbsp; - 1Second (announced 1 second pink noise encoded to ITU-R BS.2159-4 and SMPTE 2603 speaker positions)<br>\n&nbsp; &nbsp; - 100ms_3Bursts (announced 3 bursts of 100ms pink noise encoded to ITU-R BS.2159-4 and SMPTE 2603 speaker positions)<br>\n&nbsp; &nbsp; - 200ms_3Bursts (announced 3 bursts of 200ms pink noise encoded to ITU-R BS.2159-4 and SMPTE 2603 speaker positions)<br>\n- XOrder_MovingSources (X Order encoded noise sources that circle azimuth/elevation at specified speeds)<br>\n&nbsp; &nbsp; - PinkNoise_-20dBFS_60s_48kHz_24Bit (60 seconds, azimuth &amp; elevation pink noise at 45, 90 &amp; 180 degrees per second)<br>\n&nbsp; &nbsp; - WhiteNoise_-20dBFS_60s_48kHz_24Bit (60 seconds, azimuth &amp; elevation white noise at 45, 90 &amp; 180 degrees per second)<br>\n- XOrder_PinkNoise (X Order encoded Pink noise files)<br>\n&nbsp; &nbsp; - Cube (sources encoded to cube face positions)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - PinkNoise_-20dBFS_1s_48kHz_24Bit (1 second)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - PinkNoise_-20dBFS_10s_48kHz_24Bit (10 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - PinkNoise_-20dBFS_60s_48kHz_24Bit (60 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - PinkNoise_-20dBFS_100ms_48kHz_24Bit (100 milliseconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - PinkNoise_-20dBFS_200ms_48kHz_24Bit (200 milliseconds)<br>\n&nbsp; &nbsp; - Dodecahedron (sources encoded to dodecahedron face positions)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - PinkNoise_-20dBFS_1s_48kHz_24Bit (1 second)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - PinkNoise_-20dBFS_10s_48kHz_24Bit (10 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - PinkNoise_-20dBFS_60s_48kHz_24Bit (60 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - PinkNoise_-20dBFS_100ms_48kHz_24Bit (100 milliseconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - PinkNoise_-20dBFS_200ms_48kHz_24Bit (200 milliseconds)<br>\n- XOrder_WhiteNoise (X Order encoded White noise files)<br>\n&nbsp; &nbsp; - Cube (sources 
encoded to cube face positions)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - WhiteNoise_-20dBFS_1s_48kHz_24Bit (1 second)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - WhiteNoise_-20dBFS_10s_48kHz_24Bit (10 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - WhiteNoise_-20dBFS_60s_48kHz_24Bit (60 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - WhiteNoise_-20dBFS_100ms_48kHz_24Bit (100 milliseconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - WhiteNoise_-20dBFS_200ms_48kHz_24Bit (200 milliseconds)<br>\n&nbsp; &nbsp; - Dodecahedron (sources encoded to dodecahedron face positions)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - WhiteNoise_-20dBFS_1s_48kHz_24Bit (1 second)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - WhiteNoise_-20dBFS_10s_48kHz_24Bit (10 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - WhiteNoise_-20dBFS_60s_48kHz_24Bit (60 seconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - WhiteNoise_-20dBFS_100ms_48kHz_24Bit (100 milliseconds)<br>\n&nbsp; &nbsp; &nbsp; &nbsp; - WhiteNoise_-20dBFS_200ms_48kHz_24Bit (200 milliseconds)</p>\n\n<p>---</p>\n\n<p>For any enquiries regarding the data please email: ho581@york.ac.uk</p>\n\n<p>Data produced by Harry Ogden at the Audio Lab, Department of Electronic Engineering, University of York<br>\nContact: ho581@york.ac.uk</p>\n\n<p>Funding was provided by the UK Engineering and Physical Sciences Research Council (EPSRC) and the Department of Electronic Engineering at the University of York.</p>", 
  "license": "https://creativecommons.org/licenses/by/4.0/legalcode", 
  "creator": [
    {
      "affiliation": "University of York", 
      "@type": "Person", 
      "name": "Ogden, Harry"
    }, 
    {
      "affiliation": "University of York", 
      "@type": "Person", 
      "name": "Stubbs, Jess"
    }, 
    {
      "affiliation": "University of York", 
      "@id": "https://orcid.org/0000-0002-0692-236X", 
      "@type": "Person", 
      "name": "Kearney, Gavin"
    }
  ], 
  "url": "https://zenodo.org/record/2602033", 
  "datePublished": "2019-03-25", 
  "version": "1.0", 
  "@context": "https://schema.org/", 
  "identifier": "https://doi.org/10.5281/zenodo.2602033", 
  "@id": "https://doi.org/10.5281/zenodo.2602033", 
  "@type": "MediaObject", 
  "name": "Test Database for the Assessment of Immersive Audio Systems"
}
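
The encoded file names listed in the description appear to follow a common pattern: a source label, a level in dBFS, an optional duration, a sample rate in kHz and a bit depth (for example `PinkNoise_-20dBFS_10s_48kHz_24Bit`). The sketch below shows one way such names could be split into fields when selecting stimuli. It is only an illustration inferred from the names listed above; the `StimulusName` type and `parse_stimulus_name` function are hypothetical helpers, not part of the dataset, and files in the archive may deviate from this pattern.

```python
import re
from typing import NamedTuple, Optional


class StimulusName(NamedTuple):
    """Fields inferred from a file name (hypothetical helper type)."""
    source: str                  # e.g. "PinkNoise" or "3rdOctPinkNoise_32Hz"
    level_dbfs: int              # e.g. -20
    duration_s: Optional[float]  # None when the name carries no duration
    sample_rate_hz: int          # e.g. 48000
    bit_depth: int               # e.g. 24


# Assumed pattern: <source>_<level>dBFS[_<duration>(s|ms)]_<rate>kHz_<bits>Bit
_PATTERN = re.compile(
    r"^(?P<source>.+?)_(?P<level>-?\d+)dBFS"
    r"(?:_(?P<dur>\d+)(?P<unit>ms|s))?"
    r"_(?P<rate>\d+)kHz_(?P<bits>\d+)Bit$"
)


def parse_stimulus_name(name: str) -> StimulusName:
    """Split a file name (without extension) into its labelled parts."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"name does not follow the assumed pattern: {name!r}")
    duration = None
    if m["dur"] is not None:
        value = int(m["dur"])
        # Durations are written either in seconds ("10s") or milliseconds ("100ms").
        duration = value / 1000 if m["unit"] == "ms" else float(value)
    return StimulusName(
        source=m["source"],
        level_dbfs=int(m["level"]),
        duration_s=duration,
        sample_rate_hz=int(m["rate"]) * 1000,
        bit_depth=int(m["bits"]),
    )


if __name__ == "__main__":
    for example in (
        "3rdOctPinkNoise_32Hz_-20dBFS_10s_48kHz_24Bit",
        "500-2000Hz_PinkNoise_-20dBFS_100ms_48kHz_24Bit",
        "GTS_Full_-20dBFS_48kHz_24Bit",
    ):
        print(parse_stimulus_name(example))
```

Keeping the duration optional lets names such as `GTS_Full_-20dBFS_48kHz_24Bit`, which carry no duration field, parse without a special case.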