Fabian-Robert Stöter
Soumitro Chakrabarty
Emanuël Habets
Bernd Edler
2018-04-16
<p><strong>LibriCount10 0dB Dataset</strong></p>
<p>This is the description of the LibriCount10 synthetic dataset for speaker count estimation.</p>
<p>For each recording we provide the ground-truth number of speakers in the file name: `k` in `k_uniquefile.wav` is the maximum number of concurrent speakers within the 5 seconds of recording.</p>
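<p>As a minimal sketch of reading the label from a file name, assuming every name follows the `k_uniquefile.wav` pattern described above: the function name `speaker_count_from_filename` and the example paths are hypothetical and not part of the dataset.</p>

```python
from pathlib import Path


def speaker_count_from_filename(path):
    """Return the leading integer k from a name like `k_uniquefile.wav`.

    Assumption: the part of the file name before the first underscore
    is the ground-truth speaker count.
    """
    stem = Path(path).stem            # e.g. "3_uniquefile"
    k, _, _ = stem.partition("_")     # text before the first underscore
    return int(k)
```

<p>For a hypothetical path such as `data/3_uniquefile.wav`, this returns the integer 3. A name that does not start with an integer raises `ValueError`, which makes malformed names easy to spot.</p>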
<p>The dataset contains a simulated cocktail party environment of [0..10] speakers, mixed at 0 dB SNR from random utterances of different speakers from the <a href="http://www.openslr.org/12/">LibriSpeech</a> `CleanTest` dataset.</p>
<p>All recordings are 5 s long, and all speakers are active for most of the recording.</p>
<p>For each unique recording, we provide the audio wave file (16-bit, 16 kHz, mono) and an annotation `json` file with the same name as the recording.</p>
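<p>The stated audio format and the wav/json pairing can be checked with the Python standard library. The sketch below assumes that the annotation file differs from the recording only in its `.json` extension. The function name `describe_recording` is hypothetical.</p>

```python
import wave
from pathlib import Path


def describe_recording(wav_path):
    """Read basic format information from a wav file header.

    Assumption: the annotation sits next to the recording and shares
    its name, with a `.json` extension instead of `.wav`.
    """
    wav_path = Path(wav_path)
    with wave.open(str(wav_path), "rb") as wav:
        info = {
            "channels": wav.getnchannels(),
            "bits": 8 * wav.getsampwidth(),   # sample width is in bytes
            "sample_rate": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }
    info["annotation_path"] = wav_path.with_suffix(".json")
    return info
```

<p>For a recording matching the description, the expected values are 1 channel, 16 bits, a sample rate of 16000 and a duration of 5.0 seconds.</p>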
<p><strong>Metadata</strong></p>
<p>In the annotation file we provide information about each speaker's sex, their unique speaker_id, and their vocal activity within the mixture recording, given in samples. Note that the activity annotations were automatically generated using a <a href="https://github.com/wiseman/py-webrtcvad">voice activity detection</a> system.</p>
<p>In the following example, the annotation shows a speaker count of 3, which can be read from the number of elements in the list:</p>
<pre><code class="language-json">[
{
"sex": "F",
"activity": [[0, 51076], [51396, 55400], [56681, 80000]],
"speaker_id": 1221
},
{
"sex": "F",
"activity": [[0, 51877], [56201, 80000]],
"speaker_id": 3570
},
{
"sex": "M",
"activity": [[0, 15681], [16161, 68213], [73498, 80000]],
"speaker_id": 5105
}
]</code></pre>
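<p>An annotation like the one above can be summarized in a few lines of Python. This is a sketch under two assumptions: each `activity` pair is read as `[start, end)` in samples, and the 16 kHz sample rate stated for the audio is used to convert samples to seconds. The function name `summarize_annotation` is hypothetical.</p>

```python
import json

SAMPLE_RATE = 16000  # Hz, as stated for the audio files


def summarize_annotation(annotation_text):
    """Return (speaker_count, per-speaker summary) for one annotation.

    Assumption: each activity pair is [start, end) in samples, so the
    active length of a segment is end - start.
    """
    speakers = json.loads(annotation_text)
    summary = []
    for speaker in speakers:
        active_samples = sum(end - start for start, end in speaker["activity"])
        summary.append({
            "speaker_id": speaker["speaker_id"],
            "sex": speaker["sex"],
            "active_seconds": active_samples / SAMPLE_RATE,
        })
    # The speaker count is the number of elements in the top-level list.
    return len(speakers), summary
```

<p>Passing the contents of an annotation file, for example via `Path(...).read_text()`, gives the speaker count together with each speaker's total active time in seconds.</p>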
https://doi.org/10.5281/zenodo.1216072
oai:zenodo.org:1216072
Zenodo
https://doi.org/10.5281/zenodo.1216071
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
ICASSP 2018, Calgary, Canada
audio
dataset
speaker count estimation
LibriCount, a dataset for speaker count estimation
info:eu-repo/semantics/other