2024-03-19T06:02:57Z
https://zenodo.org/oai2d
oai:zenodo.org:4308731
2021-11-20T00:46:31Z
user-sigsep
openaire_data
user-c4dm
user-eu
Lordelo, Carlos
Ahlbäck, Sven
Benetos, Emmanouil
Dixon, Simon
Ohlsson, Patrik
2020-12-07
<p>The <strong>Tap & Fiddle</strong> dataset consists of 28 stereo recordings of traditional Scandinavian fiddle tunes with accompanying foot-tapping, which is standard performance practice within these musical styles. The corpus contains not only the mixed signals but also the two isolated instrumental tracks, which can be used as ground truths for music source separation algorithms:</p>
<ol>
<li>The fiddle track (<em>harmonic track</em>);</li>
<li>The foot-tapping track (<em>percussive track</em>). </li>
</ol>
<p>Foot-tapping, as a percussive accompaniment, is very often an integral part of the musical expression in Scandinavian fiddle music. For instance, some studies have shown that the dance beat of the music can even be unintelligible without the foot-tapping part [1]. Nevertheless, foot-tapping in fiddle-music performance has not yet been systematically studied. Hence, apart from contributing to the music source separation community, we hope that the release of <strong>Tap & Fiddle</strong> can also benefit researchers and enthusiasts working on the analysis of fiddle music, as well as studies of metrical expression in music in general.</p>
<p><strong>Audio and Repertoire Characteristics:</strong></p>
<p>The set contains recordings of different dance types, including Norwegian Halling tunes with straight single and double tapping, as well as Swedish polska tunes, where tapping is considerably sparser and tapping on beats 1 and 3 in triple metre is the most common pattern.</p>
<p>The sound of the foot-tapping ranges from softer tapping produced by a sock-covered foot to sharp, distinct and loud tapping produced by hard-heeled shoes on parquet. In addition, regardless of the sound source, the loudness of the foot-tapping relative to the fiddle varies between and within recordings.</p>
<p>It is also important to note that some of the recordings in the dataset are variations of the same Scandinavian fiddle tune. These versions of the same tune differ in acoustic conditions and in the audio characteristics of the foot-tapping and/or the violin sound.</p>
<p><strong>Recording Methodology: </strong></p>
<p>Each isolated signal was recorded by one fiddle player in a natural 30 m<sup>2</sup> room, with separate miking for the foot and the fiddle (violin), using close-up Shure SM-58 microphones and a Focusrite audio interface, recording into Audacity on a MacBook Pro. The mixture signals were created by adding the two isolated signals together. All the recordings were made using the same instrument, played by the same performer.</p>
<p>The audio files are uncompressed and saved as stereo <em>".wav"</em> files with a sampling frequency of 44100 Hz and 32 bits per sample. The average duration of a recording in <strong>Tap & Fiddle</strong> is 65 seconds, giving around 65 s x 28 ≈ 30 min 20 s of total playing time.</p>
<p>The dataset is divided into a training set with 23 recordings and a test set with 5. We recommend that supervised approaches be trained on the training set and tested on the test set.</p>
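<p>As a minimal sketch, the stems can be loaded with a standard audio library such as <em>soundfile</em> (the folder and file names below are illustrative only; please check the downloaded archive for the actual naming scheme):</p>
<pre><code class="language-python"># Hypothetical example of loading one Tap and Fiddle recording and its stems.
# File and folder names are illustrative; adapt them to the archive contents.
import soundfile as sf

mixture, sr = sf.read("train/tune_01/mixture.wav")   # stereo, 44.1 kHz
fiddle, _ = sf.read("train/tune_01/fiddle.wav")      # harmonic (fiddle) track
tapping, _ = sf.read("train/tune_01/tapping.wav")    # percussive (foot-tapping) track

assert sr == 44100
print(mixture.shape, fiddle.shape, tapping.shape)</code></pre>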
<p><strong>How to cite:</strong></p>
<p>If you use this dataset, please provide the following citation in your work:</p>
<ul>
<li>C. Lordelo, E. Benetos, S. Dixon, S. Ahlbäck and P. Ohlsson, "Adversarial Unsupervised Domain Adaptation for Harmonic-Percussive Source Separation", in <em>IEEE Signal Processing Letters</em>, 2020 (under review at the time of writing)</li>
</ul>
This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 765068.
https://doi.org/10.5281/zenodo.4308731
oai:zenodo.org:4308731
eng
Zenodo
https://zenodo.org/communities/c4dm
https://zenodo.org/communities/sigsep
https://zenodo.org/communities/eu
https://doi.org/10.5281/zenodo.4308730
info:eu-repo/semantics/restrictedAccess
Audio Source Separation
Harmonic-Percussive Source Separation
Scandinavian Fiddle Tunes
Foot-tapping in Fiddle Music
Tap & Fiddle: a Dataset with Scandinavian Fiddle Tunes with Accompanying Foot-Tapping
info:eu-repo/semantics/other
oai:zenodo.org:1649325
2020-11-05T17:27:48Z
user-mir
user-sigsep
user-ismir
openaire_data
Rachel Bittner
Justin Salamon
Mike Tierney
Matthias Mauch
Chris Cannam
Juan Pablo Bello
2014-10-28
<p>Audio files for the MedleyDB multitrack dataset. <strong>Annotation and Metadata files are version controlled and are available in the <a href="https://github.com/marl/medleydb">MedleyDB github</a> repository: </strong><em>Metadata</em> can be found <a href="https://github.com/marl/medleydb/tree/master/medleydb/data/Metadata">here</a>, <em>Annotations</em> can be found <a href="https://github.com/marl/medleydb/tree/master/medleydb/data/Annotations">here</a>.</p>
<p>For detailed information about the dataset, please visit MedleyDB's <a href="http://medleydb.weebly.com">website</a>.</p>
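<p>As a hedged sketch, the annotations and metadata can also be accessed programmatically through the <em>medleydb</em> Python package from the repository above (this assumes the package is installed and the <code>MEDLEYDB_PATH</code> environment variable points to the downloaded audio; the attribute names used here follow the package documentation and should be verified there):</p>
<pre><code class="language-python"># Iterate over MedleyDB multitracks via the medleydb package (illustrative sketch).
import medleydb as mdb

for mtrack in mdb.load_all_multitracks():
    # Basic per-track metadata; see the package docs for annotation accessors.
    print(mtrack.track_id, "-", mtrack.artist, "-", mtrack.title)
    print("  instrumental:", mtrack.is_instrumental, "| has bleed:", mtrack.has_bleed)</code></pre>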
<p> </p>
<p>If you make use of MedleyDB for academic purposes, please cite the following publication:<br>
<br>
<em>R. Bittner, J. Salamon, M. Tierney, M. Mauch, C. Cannam and J. P. Bello, "<a href="http://marl.smusic.nyu.edu/medleydb_webfiles/bittner_medleydb_ismir2014.pdf">MedleyDB: A Multitrack Dataset for Annotation-Intensive MIR Research</a>", in 15th International Society for Music Information Retrieval Conference, Taipei, Taiwan, Oct. 2014.</em></p>
https://doi.org/10.5281/zenodo.1649325
oai:zenodo.org:1649325
eng
Zenodo
https://doi.org/10.5281/zenodo.1438309
https://zenodo.org/communities/ismir
https://zenodo.org/communities/mir
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.1649324
info:eu-repo/semantics/restrictedAccess
multitrack
music
stems
f0 estimation
source separation
instrument identification
pitch tracking
melody estimation
f0
MedleyDB Audio: A Dataset of Multitrack Audio for Music Research
info:eu-repo/semantics/other
oai:zenodo.org:1256003
2020-01-24T19:24:49Z
user-sigsep
openaire_data
Fabian-Robert Stöter
Antoine Liutkus
Nobutaka Ito
2018-05-30
<p>This dataset contains 30s excerpts from the Signal Separation Evaluation Campaign (SiSEC 2018). It accompanies the objective scores as submitted via the <a href="https://github.com/sigsep/sigsep-mus-2018">official GitHub repository</a>. The excerpts were generated using the method described <a href="https://github.com/sigsep/sigsep-mus-cutlist-generator">here</a>, which selects 30 seconds from each audio track by determining the most active parts of each source.</p>
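<p>The linked repository contains the actual cutlist generator; the sketch below only illustrates the underlying idea of picking the window with the highest summed source activity and is not the official implementation:</p>
<pre><code class="language-python"># Illustrative sketch: find the start (in seconds) of the most active excerpt
# by maximising the summed energy of all isolated sources over a sliding window.
import numpy as np

def most_active_start(sources, sr, excerpt_len=30.0, hop_secs=1.0):
    """sources: list of (n_samples, n_channels) float arrays, all the same length."""
    win = int(excerpt_len * sr)
    hop = int(hop_secs * sr)
    energy = sum(np.sum(np.asarray(s, dtype=np.float64) ** 2, axis=1) for s in sources)
    csum = np.concatenate(([0.0], np.cumsum(energy)))
    starts = np.arange(0, len(energy) - win, hop)
    window_energy = csum[starts + win] - csum[starts]
    return starts[int(np.argmax(window_energy))] / sr</code></pre>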
https://doi.org/10.5281/zenodo.1256003
oai:zenodo.org:1256003
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.1256002
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
SiSEC18-MUS 30s Excerpts
info:eu-repo/semantics/other
oai:zenodo.org:3270814
2020-01-24T19:25:46Z
user-sigsep
openaire_data
Fabian-Robert Stöter
Antoine Liutkus
Nobutaka Ito
2019-07-01
<p>This dataset contains 7s excerpts from the Signal Separation Evaluation Campaign (SiSEC 2018). It accompanies the objective scores as submitted via the <a href="https://github.com/sigsep/sigsep-mus-2018">official GitHub repository</a>. The excerpts were generated using the method described <a href="https://github.com/sigsep/sigsep-mus-cutlist-generator">here</a>, which selects 7 seconds from each audio track by determining the most active parts of each source.</p>
https://doi.org/10.5281/zenodo.3270814
oai:zenodo.org:3270814
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.1256063
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
source separation, music
SiSEC18-MUS 7s Excerpts
info:eu-repo/semantics/other
oai:zenodo.org:1490097
2020-01-24T19:21:47Z
user-sigsep
openaire_data
Fabian-Robert Stöter
Dominic Ward
Aditya A. Nugraha
2018-11-16
<p>added pandas dataframe</p>
https://doi.org/10.5281/zenodo.1490097
oai:zenodo.org:1490097
Zenodo
https://github.com/faroit/sisec-mus-results/tree/1.0
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.1490096
info:eu-repo/semantics/openAccess
Other (Open)
SiSEC 2016 Submissions
info:eu-repo/semantics/other
oai:zenodo.org:1490095
2020-01-25T07:25:20Z
user-sigsep
software
Fabian-Robert Stöter
Nils Werner
2016-11-01
<p>Source for <a href="http://sisec17.audiolabs-erlangen.de">SiSEC MUS 2016 Website</a></p>
https://doi.org/10.5281/zenodo.1490095
oai:zenodo.org:1490095
Zenodo
https://github.com/faroit/sisec-mus-website/tree/v1.0
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.1490094
info:eu-repo/semantics/openAccess
Other (Open)
SiSEC 2016 Website
info:eu-repo/semantics/other
oai:zenodo.org:3743844
2020-09-02T23:57:23Z
user-sigsep
openaire_data
user-dcase
Romain Serizel
Nicolas Turpault
Justin Salamon
Prem Seetharaman
Eduardo Fonseca
Frederic Font Corbera
Scott Wisdom
Hakan Erdogan
Dan Ellis
John R. Hershey
2020-03-04
<p>The Free Universal Sound Separation (FUSS) Dataset is a database of arbitrary sound mixtures and source-level references, for use in experiments on arbitrary sound separation. </p>
<p>This is the official sound separation data for the DCASE2020 Challenge Task 4: Sound Event Detection and Separation in Domestic Environments.</p>
<p><strong>Citation: </strong>If you use the FUSS dataset or part of it, please cite our paper describing the dataset and baseline [1]. FUSS is based on <a href="https://annotator.freesound.org/fsd/">FSD data</a> so please also cite [2]:</p>
<p><strong>Overview: </strong>FUSS audio data is sourced from a pre-release of the <a href="https://annotator.freesound.org/fsd/">Freesound Dataset</a> known as FSD50K, a sound event dataset composed of Freesound content annotated with labels from the AudioSet Ontology. Using the FSD50K labels, these source files have been screened such that they likely only contain a single type of sound. Labels are not provided for these source files and are not considered part of the challenge. For the purpose of the DCASE Task 4 Sound Separation and Event Detection challenge, systems should not use FSD50K labels, even though they may become available upon FSD50K release.</p>
<p>To create mixtures, 10 second clips of sources are convolved with simulated room impulse responses and added together. Each 10 second mixture contains between 1 and 4 sources. Source files longer than 10 seconds are considered "background" sources. Every mixture contains one background source, which is active for the entire duration. We provide: a software recipe to create the dataset, the room impulse responses, and the original source audio.</p>
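<p>The official recipe is linked below; purely as an illustration of this mixing step, the following sketch convolves each source with its simulated room impulse response and sums the results (array shapes and handling are assumptions, not the exact pipeline):</p>
<pre><code class="language-python"># Illustrative sketch of the FUSS-style mixing step: reverberate each source
# with its own room impulse response (RIR) and sum to obtain the mixture.
import numpy as np
from scipy.signal import fftconvolve

def make_mixture(sources, rirs, num_samples):
    """sources, rirs: lists of 1-D mono arrays (e.g. 16 kHz); returns mixture and references."""
    reverberated = []
    for src, rir in zip(sources, rirs):
        wet = fftconvolve(src, rir)[:num_samples]
        reverberated.append(np.pad(wet, (0, num_samples - len(wet))))
    mixture = np.sum(reverberated, axis=0)
    return mixture, reverberated</code></pre>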
<p><strong>Motivation for use in DCASE2020 Challenge Task 4: </strong> This dataset provides a platform to investigate how source separation may help with event detection and vice versa. Previous work has shown that universal sound separation (separation of arbitrary sounds) is possible [3], and that event detection can help with universal sound separation [4]. It remains to be seen whether sound separation can help with event detection. Event detection is more difficult in noisy environments, and so separation could be a useful pre-processing step. Data with strong labels for event detection are relatively scarce, especially when restricted to specific classes within a domain. In contrast, source separation data needs no event labels for training, and may be more plentiful. In this setting, the idea is to utilize larger unlabeled separation data to train separation systems, which can serve as a front-end to event-detection systems trained on more limited data.</p>
<p><strong>Room simulation: </strong>Room impulse responses are simulated using the image method with frequency-dependent walls. Each impulse corresponds to a rectangular room of random size with random wall materials, where a single microphone and up to 4 sources are placed at random spatial locations.</p>
<p><strong>Recipe for data creation: </strong>The data creation recipe starts with scripts, based on <a href="https://github.com/justinsalamon/scaper">scaper</a> [5], to generate mixtures of events with random timing of source events, along with a background source that spans the duration of the mixture clip. The scripts for this are in <a href="https://github.com/google-research/sound-separation/tree/master/datasets/fuss">this GitHub repo</a>.</p>
<p>The data are reverberated using a different room simulation for each mixture. In this simulation each source has its own reverberation corresponding to a different spatial location. The reverberated mixtures are created by summing over the reverberated sources. The dataset recipe scripts support modification, so that participants may remix and augment the training data as desired.</p>
<p>The constituent source files for each mixture are also generated for use as references for training and evaluation.</p>
<p>Note: no attempt was made to remove digital silence from the freesound source data, so some reference sources may include digital silence, and there are a few mixtures where the background reference is all digital silence. Digital silence can also be observed in the event recognition public evaluation data, so it is important to be able to handle this in practice. Our evaluation scripts handle it by ignoring any reference sources that are silent. </p>
<p><strong>Format: </strong>All audio clips are provided as uncompressed PCM 16 bit, 16 kHz, mono audio files.</p>
<p><strong>Data split: </strong> The FUSS dataset is partitioned into "train", "validation", and "eval" sets, following the same splits used in FSD data. Specifically, the train and validation sets are sourced from the FSD50K dev set, and we have ensured that clips in train come from different uploaders than the clips in validation. The eval set is sourced from the FSD50K eval split.</p>
<p><strong>Baseline System: </strong>A baseline system for the FUSS dataset is available at <a href="https://github.com/google-research/sound-separation/tree/master/datasets/fuss">dcase2020_fuss_baseline</a>.</p>
<p><strong>License: </strong>All audio clips (i.e., in FUSS_fsd_data.tar.gz) used in the preparation of Free Universal Source Separation (FUSS) dataset are designated Creative Commons (CC0) and were obtained from<a href="http://freesound.org"> freesound.org</a>. The source data in FUSS_fsd_data.tar.gz were selected using labels from the<a href="https://annotator.freesound.org/fsd/"> FSD50K corpus</a>, which is licensed as Creative Commons Attribution 4.0 International (CC BY 4.0) License.</p>
<p>The FUSS dataset as a whole is a curated, reverberated, mixed, and partitioned preparation, and is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) License. This license is specified in the `LICENSE-DATASET` file downloaded with the `FUSS_license_doc.tar.gz` file.</p>
<p> </p>
Added:
- FUSS_baseline_dry_model.tar.gz: baseline separation model trained on non-reverberated (dry) data.
- FUSS_DESED_baseline_dry_2_model.tar.gz: baseline separation model for the DESED task, trained on a mixture of DESED in-domain data and FUSS data.
https://doi.org/10.5281/zenodo.3743844
oai:zenodo.org:3743844
Zenodo
https://zenodo.org/communities/dcase
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3694383
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
sound separation
Free Universal Sound Separation Dataset
info:eu-repo/semantics/other
oai:zenodo.org:3710392
2020-09-02T23:57:22Z
user-sigsep
openaire_data
user-dcase
Romain Serizel
Nicolas Turpault
Justin Salamon
Prem Seetharaman
Eduardo Fonseca
Frederic Font Corbera
Scott Wisdom
Hakan Erdogan
Dan Ellis
John R. Hershey
2020-03-04
<p>The Free Universal Sound Separation (FUSS) Dataset is a database of arbitrary sound mixtures and source-level references, for use in experiments on arbitrary sound separation. </p>
<p>This is the official sound separation data for the DCASE2020 Challenge Task 4: Sound Event Detection and Separation in Domestic Environments.</p>
<p><strong>Citation: </strong>If you use the FUSS dataset or part of it, please cite our paper describing the dataset and baseline [1]. FUSS is based on <a href="https://annotator.freesound.org/fsd/">FSD data</a> so please also cite [2]:</p>
<p><strong>Overview: </strong>FUSS audio data is sourced from a pre-release of the <a href="https://annotator.freesound.org/fsd/">Freesound Dataset</a> known as FSD50K, a sound event dataset composed of Freesound content annotated with labels from the AudioSet Ontology. Using the FSD50K labels, these source files have been screened such that they likely only contain a single type of sound. Labels are not provided for these source files and are not considered part of the challenge. For the purpose of the DCASE Task 4 Sound Separation and Event Detection challenge, systems should not use FSD50K labels, even though they may become available upon FSD50K release.</p>
<p>To create mixtures, 10 second clips of sources are convolved with simulated room impulse responses and added together. Each 10 second mixture contains between 1 and 4 sources. Source files longer than 10 seconds are considered "background" sources. Every mixture contains one background source, which is active for the entire duration. We provide: a software recipe to create the dataset, the room impulse responses, and the original source audio.</p>
<p><strong>Motivation for use in DCASE2020 Challenge Task 4: </strong> This dataset provides a platform to investigate how source separation may help with event detection and vice versa. Previous work has shown that universal sound separation (separation of arbitrary sounds) is possible [3], and that event detection can help with universal sound separation [4]. It remains to be seen whether sound separation can help with event detection. Event detection is more difficult in noisy environments, and so separation could be a useful pre-processing step. Data with strong labels for event detection are relatively scarce, especially when restricted to specific classes within a domain. In contrast, source separation data needs no event labels for training, and may be more plentiful. In this setting, the idea is to utilize larger unlabeled separation data to train separation systems, which can serve as a front-end to event-detection systems trained on more limited data.</p>
<p><strong>Room simulation: </strong>Room impulse responses are simulated using the image method with frequency-dependent walls. Each impulse corresponds to a rectangular room of random size with random wall materials, where a single microphone and up to 4 sources are placed at random spatial locations.</p>
<p><strong>Recipe for data creation: </strong>The data creation recipe starts with scripts, based on <a href="https://github.com/justinsalamon/scaper">scaper</a> [5], to generate mixtures of events with random timing of source events, along with a background source that spans the duration of the mixture clip. The scripts for this are in <a href="https://github.com/google-research/sound-separation/tree/master/datasets/fuss">this GitHub repo</a>.</p>
<p>The data are reverberated using a different room simulation for each mixture. In this simulation each source has its own reverberation corresponding to a different spatial location. The reverberated mixtures are created by summing over the reverberated sources. The dataset recipe scripts support modification, so that participants may remix and augment the training data as desired.</p>
<p>The constituent source files for each mixture are also generated for use as references for training and evaluation.</p>
<p>Note: no attempt was made to remove digital silence from the freesound source data, so some reference sources may include digital silence, and there are a few mixtures where the background reference is all digital silence. Digital silence can also be observed in the event recognition public evaluation data, so it is important to be able to handle this in practice. Our evaluation scripts handle it by ignoring any reference sources that are silent. </p>
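<p>The official evaluation scripts live in the repository linked below; as a hedged sketch, ignoring all-silent references before computing metrics could look like this:</p>
<pre><code class="language-python"># Drop all-silent reference sources (and their estimates) before evaluation.
import numpy as np

def drop_silent_references(references, estimates, eps=1e-8):
    """references, estimates: lists of 1-D arrays aligned by source index."""
    keep = [i for i, ref in enumerate(references) if np.max(np.abs(ref)) > eps]
    return [references[i] for i in keep], [estimates[i] for i in keep]</code></pre>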
<p><strong>Format: </strong>All audio clips are provided as uncompressed PCM 16 bit, 16 kHz, mono audio files.</p>
<p><strong>Data split: </strong> The FUSS dataset is partitioned into "train", "validation", and "eval" sets, following the same splits used in FSD data. Specifically, the train and validation sets are sourced from the FSD50K dev set, and we have ensured that clips in train come from different uploaders than the clips in validation. The eval set is sourced from the FSD50K eval split.</p>
<p><strong>Baseline System: </strong>A baseline system for the FUSS dataset is available at <a href="https://github.com/google-research/sound-separation/tree/master/datasets/fuss">dcase2020_fuss_baseline</a>.</p>
<p><strong>License: </strong>All audio clips (i.e., in FUSS_fsd_data.tar.gz) used in the preparation of Free Universal Source Separation (FUSS) dataset are designated Creative Commons (CC0) and were obtained from<a href="http://freesound.org"> freesound.org</a>. The source data in FUSS_fsd_data.tar.gz were selected using labels from the<a href="https://annotator.freesound.org/fsd/"> FSD50K corpus</a>, which is licensed as Creative Commons Attribution 4.0 International (CC BY 4.0) License.</p>
<p>The FUSS dataset as a whole is a curated, reverberated, mixed, and partitioned preparation, and is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) License. This license is specified in the `LICENSE-DATASET` file downloaded with the `FUSS_license_doc.tar.gz` file.</p>
<p> </p>
https://doi.org/10.5281/zenodo.3710392
oai:zenodo.org:3710392
Zenodo
https://zenodo.org/communities/dcase
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3694383
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
sound separation
Free Universal Sound Separation Dataset
info:eu-repo/semantics/other
oai:zenodo.org:3267291
2020-01-25T19:22:11Z
user-sigsep
software
Fabian-Robert Stöter
Antoine Liutkus
2019-08-14
<p>Weights of the UMX-HQ music separation model for pytorch. For more information, visit https://github.com/sigsep/open-unmix-pytorch</p>
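<p>As a minimal sketch, the pretrained separator can be loaded through <code>torch.hub</code> (the entry-point name follows the open-unmix-pytorch repository; please verify it against the README there):</p>
<pre><code class="language-python"># Load the pretrained UMX-HQ separator via torch.hub and run it on dummy audio.
import torch

separator = torch.hub.load("sigsep/open-unmix-pytorch", "umxhq")
audio = torch.rand(1, 2, 44100 * 5)      # (batch, channels, samples): 5 s of noise
estimates = separator(audio)              # (batch, targets, channels, samples)
print(estimates.shape)</code></pre>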
https://doi.org/10.5281/zenodo.3267291
oai:zenodo.org:3267291
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3267290
info:eu-repo/semantics/openAccess
MIT License
https://opensource.org/licenses/MIT
source separation
deep learning
pytorch
Open-Unmix-Pytorch UMX-HQ
info:eu-repo/semantics/other
oai:zenodo.org:3370489
2020-01-25T19:22:13Z
user-sigsep
software
Fabian-Robert Stöter
Antoine Liutkus
2019-08-14
<p>Weights of the UMX-HQ music separation model for pytorch. For more information, visit https://github.com/sigsep/open-unmix-pytorch</p>
https://doi.org/10.5281/zenodo.3370489
oai:zenodo.org:3370489
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3267290
info:eu-repo/semantics/openAccess
MIT License
https://opensource.org/licenses/MIT
source separation
deep learning
pytorch
Open-Unmix-Pytorch UMX-HQ
info:eu-repo/semantics/other
oai:zenodo.org:3340804
2020-01-25T07:27:37Z
user-sigsep
software
Stöter, Fabian-Robert
Antoine Liutkus
2019-08-14
<p>Weights of the UMX music separation model for pytorch. For more information, visit <a href="https://github.com/sigsep/open-unmix-pytorch">https://github.com/sigsep/open-unmix-pytorch</a></p>
https://doi.org/10.5281/zenodo.3340804
oai:zenodo.org:3340804
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3340803
info:eu-repo/semantics/openAccess
MIT License
https://opensource.org/licenses/MIT
Open-Unmix-Pytorch UMX
info:eu-repo/semantics/other
oai:zenodo.org:3906389
2020-12-07T11:31:44Z
user-mir
user-sigsep
software
Hennequin Romain
Khlif Anis
Voituret Felix
Moussallam Manuel
2020-06-24
<p><strong>Spleeter</strong> is the <a href="https://www.deezer.com/">Deezer</a> source separation library with pretrained models, written in <a href="https://www.python.org/">Python</a> and based on <a href="https://tensorflow.org/">Tensorflow</a>. It makes it easy to train source separation models (assuming you have a dataset of isolated sources) and provides already trained state-of-the-art models for performing various flavours of separation:</p>
<ul>
<li>Vocals (singing voice) / accompaniment separation (<a href="https://github.com/deezer/spleeter/wiki/2.-Getting-started#using-2stems-model">2 stems</a>)</li>
<li>Vocals / drums / bass / other separation (<a href="https://github.com/deezer/spleeter/wiki/2.-Getting-started#using-4stems-model">4 stems</a>)</li>
<li>Vocals / drums / bass / piano / other separation (<a href="https://github.com/deezer/spleeter/wiki/2.-Getting-started#using-5stems-model">5 stems</a>)</li>
</ul>
<p>The 2-stems and 4-stems models achieve <a href="https://github.com/deezer/spleeter/wiki/Separation-Performances">high performance</a> on the <a href="https://sigsep.github.io/datasets/musdb.html">musdb</a> dataset. <strong>Spleeter</strong> is also very fast, as it can separate audio files into 4 stems 100x faster than real time when run on a GPU.</p>
<p>We designed <strong>Spleeter</strong> so you can use it straight from the <a href="https://github.com/deezer/spleeter/wiki/2.-Getting-started#usage">command line</a> as well as directly in your own development pipeline as a <a href="https://github.com/deezer/spleeter/wiki/4.-API-Reference#separator">Python library</a>. It can be installed with <a href="https://github.com/deezer/spleeter/wiki/1.-Installation#using-conda">Conda</a> or <a href="https://github.com/deezer/spleeter/wiki/1.-Installation#using-pip">pip</a>, or be used with <a href="https://github.com/deezer/spleeter/wiki/2.-Getting-started#using-docker-image">Docker</a>.</p>
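<p>A short usage sketch of the Python API mentioned above (paths are illustrative; the pretrained model is downloaded on first use):</p>
<pre><code class="language-python"># Separate a song into vocals and accompaniment with the pretrained 2-stems model.
from spleeter.separator import Separator

separator = Separator("spleeter:2stems")
# Writes output/song/vocals.wav and output/song/accompaniment.wav
separator.separate_to_file("song.mp3", "output")</code></pre>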
https://doi.org/10.5281/zenodo.3906389
oai:zenodo.org:3906389
eng
Zenodo
https://doi.org/10.21105/joss.02154
https://zenodo.org/communities/mir
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3906388
info:eu-repo/semantics/openAccess
MIT License
https://opensource.org/licenses/MIT
Spleeter: a fast and efficient music source separation tool with pre-trained models
info:eu-repo/semantics/other
oai:zenodo.org:5069601
2021-07-05T13:48:18Z
user-sigsep
openaire_data
Fabian-Robert Stöter
Antoine Liutkus
2021-07-05
<p>Weights of the UMX-L (umxl) music separation model for PyTorch. For more information, visit https://github.com/sigsep/open-unmix-pytorch</p>
https://doi.org/10.5281/zenodo.5069601
oai:zenodo.org:5069601
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.5069600
info:eu-repo/semantics/openAccess
Creative Commons Attribution Non Commercial Share Alike 4.0 International
https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
source separation
deep learning
pytorch
Open-Unmix-Pytorch UMX-L
info:eu-repo/semantics/other
oai:zenodo.org:3786908
2023-10-29T13:06:55Z
user-sigsep
openaire_data
Uhlich, Stefan
Mitsufuji, Yuki
2020-05-05
<p>Weights of Open-Unmix trained on the 28-speaker version of Voicebank+Demand (Sampling rate: 16kHz). The weights can be used with <a href="https://github.com/sigsep/open-unmix-nnabla">open-unmix-nnabla</a> and <a href="https://github.com/sigsep/open-unmix-pytorch">open-unmix-pytorch</a>.</p>
https://doi.org/10.5281/zenodo.3786908
oai:zenodo.org:3786908
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3786907
info:eu-repo/semantics/openAccess
MIT License
https://opensource.org/licenses/MIT
Source Separation
Speech Enhancement
NNabla
PyTorch
Open-Unmix for Speech Enhancement (UMX SE)
info:eu-repo/semantics/other
oai:zenodo.org:4704231
2023-10-28T10:54:36Z
user-sigsep
openaire_data
Sawata, Ryosuke
Uhlich, Stefan
Takahashi, Shusuke
Mitsufuji, Yuki
2021-04-20
<p>Weights of CrossNet-Open-Unmix (X-UMX) trained on <a href="https://sigsep.github.io/datasets/musdb.html">MUSDB18</a>. The weights can be used with <a href="https://github.com/asteroid-team/asteroid/tree/master/egs/musdb18/X-UMX">X-UMX on Asteroid (PyTorch)</a>. The details of X-UMX are described <a href="https://arxiv.org/abs/2010.04228">here</a>.</p>
https://doi.org/10.5281/zenodo.4704231
oai:zenodo.org:4704231
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.4704230
info:eu-repo/semantics/openAccess
MIT License
https://opensource.org/licenses/MIT
Source Separation
Music Source Separation
PyTorch
Asteroid
CrossNet-Open-Unmix for Music Source Separation (X-UMX)
info:eu-repo/semantics/other
oai:zenodo.org:7657264
2023-02-20T14:26:33Z
user-sigsep
openaire_data
Fabian-Robert Stöter
2023-02-20
<p>SDXDB21 Bleeding Baseline</p>
<pre>We split the training data into train and valid. For valid, the following songs were used:</pre>
<pre>bc1f2967-f834-43bd-aadc-95afc897cfe7
cc3e4991-6cce-40fe-a917-81a4fbb92ea6
ed90a89a-bf22-444d-af3d-d9ac3896ebd2
f4b735de-14b1-4091-a9ba-c8b30c0740a7
bc964128-da16-4e4c-af95-4d1211e78c70
cc7f7675-d3c8-4a49-a2d7-a8959b694004
f40ffd10-4e8b-41e6-bd8a-971929ca9138</pre>
<pre>The following commands were used to create the models:</pre>
<pre><code class="language-bash">OMP_NUM_THREADS=1 CUDA_VISIBLE_DEVICES=3 python train.py \
--root /sdxdb23_bleeding_v1.0 \
--dataset trackfolder_fix \
--target-file vocals.wav \
--interferer-files bass.wav drums.wav other.wav \
--random-track-mix \
--lr-decay-patience 160 \
--source-augmentations gain channelswap
OMP_NUM_THREADS=1 CUDA_VISIBLE_DEVICES=4 python train.py \
--root /sdxdb23_bleeding_v1.0 \
--dataset trackfolder_fix \
--target-file bass.wav \
--interferer-files vocals.wav drums.wav other.wav \
--random-track-mix \
--lr-decay-patience 160 \
--source-augmentations gain channelswap
OMP_NUM_THREADS=1 CUDA_VISIBLE_DEVICES=5 python train.py \
--root /sdxdb23_bleeding_v1.0 \
--dataset trackfolder_fix \
--target-file drums.wav \
--interferer-files bass.wav vocals.wav other.wav \
--random-track-mix \
--lr-decay-patience 160 \
--source-augmentations gain channelswap
OMP_NUM_THREADS=1 CUDA_VISIBLE_DEVICES=7 python train.py \
--root /sdxdb23_bleeding_v1.0 \
--dataset trackfolder_fix \
--target-file other.wav \
--interferer-files bass.wav drums.wav vocals.wav \
--random-track-mix \
--lr-decay-patience 160 \
--source-augmentations gain channelswap</code></pre>
<p> </p>
https://doi.org/10.5281/zenodo.7657264
oai:zenodo.org:7657264
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.7657263
info:eu-repo/semantics/openAccess
Creative Commons Attribution Non Commercial Share Alike 4.0 International
https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
Open-Unmix Pytorch Bleeding
info:eu-repo/semantics/other
oai:zenodo.org:4740378
2023-10-28T10:55:54Z
user-sigsep
openaire_data
Sawata, Ryosuke
Uhlich, Stefan
Takahashi, Shusuke
Mitsufuji, Yuki
2021-05-06
<p>Weights of CrossNet-Open-Unmix (X-UMX) trained on <a href="https://sigsep.github.io/datasets/musdb.html">MUSDB18-HQ</a>. The weights can be used with <a href="https://github.com/asteroid-team/asteroid/tree/master/egs/musdb18/X-UMX">X-UMX on Asteroid (PyTorch)</a>. The details of X-UMX are described <a href="https://arxiv.org/abs/2010.04228">here</a>.</p>
https://doi.org/10.5281/zenodo.4740378
oai:zenodo.org:4740378
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.4740377
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
Source Separation
Music Source Separation
PyTorch
Asteroid
CrossNet-Open-Unmix for Music Source Separation (X-UMX-HQ)
info:eu-repo/semantics/other
oai:zenodo.org:1256064
2020-01-24T19:25:47Z
user-sigsep
openaire_data
Fabian-Robert Stöter
Antoine Liutkus
Nobutaka Ito
2018-05-30
<p>This dataset contains 7s excerpts from the Signal Separation Evaluation Campaign (SiSEC 2018). It accompanies the objective scores as submitted via the <a href="https://github.com/sigsep/sigsep-mus-2018">official GitHub repository</a>. The excerpts were generated using the method described <a href="https://github.com/sigsep/sigsep-mus-cutlist-generator">here</a>, which selects 7 seconds from each audio track by determining the most active parts of each source.</p>
https://doi.org/10.5281/zenodo.1256064
oai:zenodo.org:1256064
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.1256063
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
source separation, music
SiSEC18-MUS 7s Excerpts
info:eu-repo/semantics/other
oai:zenodo.org:7657181
2023-02-20T14:26:33Z
user-sigsep
openaire_data
Fabian-Robert Stöter
2023-02-20
<p>SDXDB21 LabelNoise Baseline</p>
<pre>We split the training data into train and valid. For valid, the following songs were used:</pre>
<pre>bc1f2967-f834-43bd-aadc-95afc897cfe7
cc3e4991-6cce-40fe-a917-81a4fbb92ea6
ed90a89a-bf22-444d-af3d-d9ac3896ebd2
f4b735de-14b1-4091-a9ba-c8b30c0740a7
bc964128-da16-4e4c-af95-4d1211e78c70
cc7f7675-d3c8-4a49-a2d7-a8959b694004
f40ffd10-4e8b-41e6-bd8a-971929ca9138</pre>
<pre>The following commands were used to create the models:</pre>
<pre><code class="language-bash">OMP_NUM_THREADS=1 CUDA_VISIBLE_DEVICES=3 python train.py \
--root /sdxdb23_labelnoise_v1.0 \
--dataset trackfolder_fix \
--target-file vocals.wav \
--interferer-files bass.wav drums.wav other.wav \
--random-track-mix \
--lr-decay-patience 160 \
--source-augmentations gain channelswap
OMP_NUM_THREADS=1 CUDA_VISIBLE_DEVICES=4 python train.py \
--root /sdxdb23_labelnoise_v1.0 \
--dataset trackfolder_fix \
--target-file bass.wav \
--interferer-files vocals.wav drums.wav other.wav \
--random-track-mix \
--lr-decay-patience 160 \
--source-augmentations gain channelswap
OMP_NUM_THREADS=1 CUDA_VISIBLE_DEVICES=5 python train.py \
--root /sdxdb23_labelnoise_v1.0 \
--dataset trackfolder_fix \
--target-file drums.wav \
--interferer-files bass.wav vocals.wav other.wav \
--random-track-mix \
--lr-decay-patience 160 \
--source-augmentations gain channelswap
OMP_NUM_THREADS=1 CUDA_VISIBLE_DEVICES=7 python train.py \
--root /sdxdb23_labelnoise_v1.0 \
--dataset trackfolder_fix \
--target-file other.wav \
--interferer-files bass.wav drums.wav vocals.wav \
--random-track-mix \
--lr-decay-patience 160 \
--source-augmentations gain channelswap</code></pre>
<p> </p>
https://doi.org/10.5281/zenodo.7657181
oai:zenodo.org:7657181
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.7657180
info:eu-repo/semantics/openAccess
Creative Commons Attribution Non Commercial Share Alike 4.0 International
https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
Open-Unmix-Pytorch LabelNoise
info:eu-repo/semantics/other
oai:zenodo.org:3370486
2020-01-25T07:27:05Z
user-sigsep
software
Stöter, Fabian-Robert
Antoine Liutkus
2019-08-14
<p>Weights of the UMX music separation model for pytorch. For more information, visit <a href="https://github.com/sigsep/open-unmix-pytorch">https://github.com/sigsep/open-unmix-pytorch</a></p>
https://doi.org/10.5281/zenodo.3370486
oai:zenodo.org:3370486
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3340803
info:eu-repo/semantics/openAccess
MIT License
https://opensource.org/licenses/MIT
Open-Unmix-Pytorch UMX
info:eu-repo/semantics/other
oai:zenodo.org:4012661
2020-09-03T00:59:25Z
user-sigsep
openaire_data
user-dcase
Romain Serizel
Nicolas Turpault
Justin Salamon
Prem Seetharaman
Eduardo Fonseca
Frederic Font Corbera
Scott Wisdom
Hakan Erdogan
Dan Ellis
John R. Hershey
2020-03-04
<p>The Free Universal Sound Separation (FUSS) Dataset is a database of arbitrary sound mixtures and source-level references, for use in experiments on arbitrary sound separation. </p>
<p>This is the official sound separation data for the DCASE2020 Challenge Task 4: Sound Event Detection and Separation in Domestic Environments.</p>
<p><strong>Citation: </strong>If you use the FUSS dataset or part of it, please cite our paper describing the dataset and baseline [1]. FUSS is based on <a href="https://annotator.freesound.org/fsd/">FSD data</a> so please also cite [2]:</p>
<p><strong>Overview: </strong>FUSS audio data is sourced from a pre-release of the <a href="https://annotator.freesound.org/fsd/">Freesound Dataset</a> known as FSD50K, a sound event dataset composed of Freesound content annotated with labels from the AudioSet Ontology. Using the FSD50K labels, these source files have been screened such that they likely only contain a single type of sound. Labels are not provided for these source files and are not considered part of the challenge. For the purpose of the DCASE Task 4 Sound Separation and Event Detection challenge, systems should not use FSD50K labels, even though they may become available upon FSD50K release.</p>
<p>To create mixtures, 10 second clips of sources are convolved with simulated room impulse responses and added together. Each 10 second mixture contains between 1 and 4 sources. Source files longer than 10 seconds are considered "background" sources. Every mixture contains one background source, which is active for the entire duration. We provide: a software recipe to create the dataset, the room impulse responses, and the original source audio.</p>
<p><strong>Motivation for use in DCASE2020 Challenge Task 4: </strong> This dataset provides a platform to investigate how source separation may help with event detection and vice versa. Previous work has shown that universal sound separation (separation of arbitrary sounds) is possible [3], and that event detection can help with universal sound separation [4]. It remains to be seen whether sound separation can help with event detection. Event detection is more difficult in noisy environments, and so separation could be a useful pre-processing step. Data with strong labels for event detection are relatively scarce, especially when restricted to specific classes within a domain. In contrast, source separation data needs no event labels for training, and may be more plentiful. In this setting, the idea is to utilize larger unlabeled separation data to train separation systems, which can serve as a front-end to event-detection systems trained on more limited data.</p>
<p><strong>Room simulation: </strong>Room impulse responses are simulated using the image method with frequency-dependent walls. Each impulse corresponds to a rectangular room of random size with random wall materials, where a single microphone and up to 4 sources are placed at random spatial locations.</p>
<p><strong>Recipe for data creation: </strong>The data creation recipe starts with scripts, based on <a href="https://github.com/justinsalamon/scaper">scaper</a> [5], to generate mixtures of events with random timing of source events, along with a background source that spans the duration of the mixture clip. The scripts for this are in <a href="https://github.com/google-research/sound-separation/tree/master/datasets/fuss">this GitHub repo</a>.</p>
<p>The data are reverberated using a different room simulation for each mixture. In this simulation each source has its own reverberation corresponding to a different spatial location. The reverberated mixtures are created by summing over the reverberated sources. The dataset recipe scripts support modification, so that participants may remix and augment the training data as desired.</p>
<p>The constituent source files for each mixture are also generated for use as references for training and evaluation.</p>
<p>Note: no attempt was made to remove digital silence from the freesound source data, so some reference sources may include digital silence, and there are a few mixtures where the background reference is all digital silence. Digital silence can also be observed in the event recognition public evaluation data, so it is important to be able to handle this in practice. Our evaluation scripts handle it by ignoring any reference sources that are silent. </p>
<p><strong>Format: </strong>All audio clips are provided as uncompressed PCM 16 bit, 16 kHz, mono audio files.</p>
<p><strong>Data split: </strong> The FUSS dataset is partitioned into "train", "validation", and "eval" sets, following the same splits used in FSD data. Specifically, the train and validation sets are sourced from the FSD50K dev set, and we have ensured that clips in train come from different uploaders than the clips in validation. The eval set is sourced from the FSD50K eval split.</p>
<p><strong>Baseline System: </strong>A baseline system for the FUSS dataset is available at <a href="https://github.com/google-research/sound-separation/tree/master/datasets/fuss">dcase2020_fuss_baseline</a>.</p>
<p><strong>License: </strong>All audio clips (i.e., in FUSS_fsd_data.tar.gz) used in the preparation of Free Universal Source Separation (FUSS) dataset are designated Creative Commons (CC0) and were obtained from<a href="http://freesound.org"> freesound.org</a>. The source data in FUSS_fsd_data.tar.gz were selected using labels from the<a href="https://annotator.freesound.org/fsd/"> FSD50K corpus</a>, which is licensed as Creative Commons Attribution 4.0 International (CC BY 4.0) License.</p>
<p>The FUSS dataset as a whole is a curated, reverberated, mixed, and partitioned preparation, and is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) License. This license is specified in the `LICENSE-DATASET` file downloaded with the `FUSS_license_doc.tar.gz` file.</p>
<p><strong>Notes:</strong></p>
<p>Added in v1.2: </p>
<ul>
<li>FUSS_baseline_dry_model.tar.gz: baseline separation model trained on non-reverberated (dry) data. </li>
<li>FUSS_DESED_baseline_dry_2_model.tar.gz:: baseline separation model for the DESED task, trained on a mixture of DESED in-domain data and FUSS data</li>
</ul>
<p>Added in v1.3:</p>
<ul>
<li>FUSS_DESED_baseline_dry_1_model.tar.gz: baseline separation model for the DESED task, trained to separate DESED mixtures from dry FUSS mixtures (DmFm)</li>
<li>FUSS_DESED_baseline_dry_4_model.tar.gz: baseline separation model for the DESED task, trained to separate DESED background, dry FUSS mixture, and 5 DESED foreground sources with PIT (PIT)</li>
<li>FUSS_DESED_baseline_dry_4np_model.tar.gz: baseline separation model for the DESED task, trained to separate DESED background, 10 DESED classes, and dry FUSS mixture without PIT (Classwise)</li>
<li>FUSS_DESED_baseline_dry_6_model.tar.gz: baseline separation model for the DESED task, trained to separate DESED background, 5 DESED foreground sources, 4 dry FUSS sources, with groupwise PIT (GroupPIT)</li>
</ul>
<p>The names in parentheses are the task names from Table 3 of the following paper: <a href="https://arxiv.org/pdf/2007.03932.pdf">Nicolas Turpault, Scott Wisdom, Hakan Erdogan, John R. Hershey, Romain Serizel, Eduardo Fonseca, Prem Seetharaman, and Justin Salamon, "Improving Sound Event Detection in Domestic Environments using Sound Separation", DCASE 2020.</a></p>
https://doi.org/10.5281/zenodo.4012661
oai:zenodo.org:4012661
Zenodo
https://zenodo.org/communities/dcase
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3694383
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
sound separation
Free Universal Sound Separation Dataset
info:eu-repo/semantics/other
oai:zenodo.org:3694384
2020-09-02T23:57:22Z
user-sigsep
openaire_data
user-dcase
Romain Serizel
Nicolas Turpault
Justin Salamon
Prem Seetharaman
Eduardo Fonseca
Frederic Font Corbera
Scott Wisdom
Hakan Erdogan
Dan Ellis
John R. Hershey
2020-03-04
<p>The Free Universal Sound Separation (FUSS) Dataset is a database of arbitrary sound mixtures and source-level references, for use in experiments on arbitrary sound separation. </p>
<p>This is the official sound separation data for the DCASE2020 Challenge Task 4: Sound Event Detection and Separation in Domestic Environments.</p>
<p><strong>Citation: </strong>If you use the FUSS dataset or part of it, please cite our paper describing the dataset and baseline [1]. FUSS is based on <a href="https://annotator.freesound.org/fsd/">FSD data</a> so please also cite [2]:</p>
<p><strong>Overview: </strong>FUSS audio data is sourced from a pre-release of the <a href="https://annotator.freesound.org/fsd/">Freesound Dataset</a> known as FSD50K, a sound event dataset composed of Freesound content annotated with labels from the AudioSet Ontology. Using the FSD50K labels, these source files have been screened such that they likely only contain a single type of sound. Labels are not provided for these source files and are not considered part of the challenge. For the purpose of the DCASE Task 4 Sound Separation and Event Detection challenge, systems should not use FSD50K labels, even though they may become available upon FSD50K release.</p>
<p>To create mixtures, 10 second clips of sources are convolved with simulated room impulse responses and added together. Each 10 second mixture contains between 1 and 4 sources. Source files longer than 10 seconds are considered "background" sources. Every mixture contains one background source, which is active for the entire duration. We provide: a software recipe to create the dataset, the room impulse responses, and the original source audio.</p>
<p><strong>Motivation for use in DCASE2020 Challenge Task 4: </strong> This dataset provides a platform to investigate how source separation may help with event detection and vice versa. Previous work has shown that universal sound separation (separation of arbitrary sounds) is possible [3], and that event detection can help with universal sound separation [4]. It remains to be seen whether sound separation can help with event detection. Event detection is more difficult in noisy environments, and so separation could be a useful pre-processing step. Data with strong labels for event detection are relatively scarce, especially when restricted to specific classes within a domain. In contrast, source separation data needs no event labels for training, and may be more plentiful. In this setting, the idea is to utilize larger unlabeled separation data to train separation systems, which can serve as a front-end to event-detection systems trained on more limited data.</p>
<p><strong>Room simulation: </strong>Room impulse responses are simulated using the image method with frequency-dependent walls. Each impulse corresponds to a rectangular room of random size with random wall materials, where a single microphone and up to 4 sources are placed at random spatial locations.</p>
<p><strong>Recipe for data creation: </strong>The data creation recipe starts with scripts, based on <a href="https://github.com/justinsalamon/scaper">scaper</a>, to generate mixtures of events with random timing of source events, along with a background source that spans the duration of the mixture clip. The scripts for this are in <a href="https://github.com/google-research/sound-separation/tree/master/datasets/fuss">this GitHub repo</a>.</p>
<p>The data are reverberated using a different room simulation for each mixture. In this simulation each source has its own reverberation corresponding to a different spatial location. The reverberated mixtures are created by summing over the reverberated sources. The dataset recipe scripts support modification, so that participants may remix and augment the training data as desired.</p>
<p>The constituent source files for each mixture are also generated for use as references for training and evaluation.</p>
<p>Note: no attempt was made to remove digital silence from the freesound source data, so some reference sources may include digital silence, and there are a few mixtures where the background reference is all digital silence. Digital silence can also be observed in the event recognition public evaluation data, so it is important to be able to handle this in practice. Our evaluation scripts handle it by ignoring any reference sources that are silent. </p>
<p><strong>Format: </strong>All audio clips are provided as uncompressed PCM 16 bit, 16 kHz, mono audio files.</p>
<p><strong>Data split: </strong> The FUSS dataset is partitioned into "train", "validation", and "eval" sets, following the same splits used in FSD data. Specifically, the train and validation sets are sourced from the FSD50K dev set, and we have ensured that clips in train come from different uploaders than the clips in validation. The eval set is sourced from the FSD50K eval split.</p>
<p><strong>Baseline System: </strong>A baseline system for the FUSS dataset is available at <a href="https://github.com/google-research/sound-separation/tree/master/datasets/fuss">dcase2020_fuss_baseline</a>.</p>
<p><strong>License: </strong>All audio clips (i.e., in FUSS_fsd_data.tar.gz) used in the preparation of Free Universal Source Separation (FUSS) dataset are designated Creative Commons (CC0) and were obtained from<a href="http://freesound.org"> freesound.org</a>. The source data in FUSS_fsd_data.tar.gz were selected using labels from the<a href="https://annotator.freesound.org/fsd/"> FSD50K corpus</a>, which is licensed as Creative Commons Attribution 4.0 International (CC BY 4.0) License.</p>
<p>The FUSS dataset as a whole is a curated, reverberated, mixed, and partitioned preparation, and is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) License. This license is specified in the `LICENSE-DATASET` file downloaded with the `FUSS_license_doc.tar.gz` file.</p>
<p>Note the links to the github repo in FUSS_license_doc/README.md are currently out of date, so please refer to FUSS_license_doc/README.md in <a href="https://github.com/google-research/sound-separation/tree/master/datasets/fuss">this GitHub repo</a> which is more recently updated.</p>
<p> </p>
https://doi.org/10.5281/zenodo.3694384
oai:zenodo.org:3694384
Zenodo
https://zenodo.org/communities/dcase
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3694383
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
sound separation
Free Universal Sound Separation Dataset
info:eu-repo/semantics/other
oai:zenodo.org:1715175
2020-11-05T17:27:45Z
user-mir
user-sigsep
user-ismir
openaire_data
Rachel Bittner
Julia Wilkins
Hanna Yip
Juan Pablo Bello
2016-08-11
<p>Audio files for the second release of the MedleyDB multitrack dataset (MedleyDB 2.0). <strong>Annotation and Metadata files are version controlled and are available in the <a href="https://github.com/marl/medleydb">MedleyDB github</a> repository: </strong><em>Metadata</em> can be found <a href="https://github.com/marl/medleydb/tree/master/medleydb/data/Metadata">here</a>, <em>Annotations</em> can be found <a href="https://github.com/marl/medleydb/tree/master/medleydb/data/Annotations">here</a>.</p>
<p>*THE MULTITRACKS IN THIS RELEASE ARE NEW. THIS DOES NOT CONTAIN ANY OF THE MULTITRACKS FROM THE <a href="https://zenodo.org/record/1438309#.XG8YmOJKhTY">ORIGINAL RELEASE OF MEDLEYDB</a>!*</p>
<p>For detailed information about the dataset, please visit MedleyDB's <a href="http://medleydb.weebly.com/">website</a>.</p>
<p> </p>
<p>If you make use of MedleyDB 2.0 for academic purposes, please cite the following white paper:</p>
<p><em>Rachel M. Bittner, Julia Wilkins, Hanna Yip and Juan P. Bello, "<a href="https://rachelbittner.weebly.com/uploads/3/2/1/8/32182799/bittner_ismirlbd-mdb_2016.pdf">MedleyDB 2.0: New Data and a System for Sustainable Data Collection</a>" Late breaking/demo extended abstract, 17th International Society for Music Information Retrieval (ISMIR) conference, August 2016.</em></p>
https://doi.org/10.5281/zenodo.1715175
oai:zenodo.org:1715175
eng
Zenodo
https://doi.org/10.5281/zenodo.1649325
https://doi.org/10.5281/zenodo.1438309
https://zenodo.org/communities/ismir
https://zenodo.org/communities/mir
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.1715174
info:eu-repo/semantics/restrictedAccess
multitrack
music
stems
source separation
instrument identification
MedleyDB 2.0 Audio
info:eu-repo/semantics/other
oai:zenodo.org:3338373
2021-02-14T09:26:41Z
user-sigsep
openaire_data
Rafii, Zafar
Liutkus, Antoine
Stöter, Fabian-Robert
Mimilakis, Stylianos Ioannis
Bittner, Rachel
2019-08-01
<p>MUSDB18-HQ is the uncompressed version of the MUSDB18 dataset. It consists of a total of 150 full-track songs of different styles and includes both the stereo mixtures and the original sources, divided between a training subset and a test subset.</p>
<p>Its purpose is to serve as a reference database for the design and the evaluation of source separation algorithms. The objective of such signal processing methods is to estimate one or more sources from a set of mixtures, e.g. for karaoke applications. It has been used as the official dataset in the professionally-produced music recordings task for SiSEC 2018, which is the international campaign for the evaluation of source separation algorithms.</p>
<p><em>musdb18-hq</em> contains two folders, a folder with a training set: “train”, composed of 100 songs, and a folder with a test set: “test”, composed of 50 songs. Supervised approaches should be trained on the training set and tested on both sets.</p>
<p>All files from the <em>musdb18-hq</em> dataset are saved as uncompressed wav files. Within each track folder, the user finds</p>
<ul>
<li>mixture.wav</li>
<li>drums.wav</li>
<li>bass.wav</li>
<li>other.wav</li>
<li>vocals.wav</li>
</ul>
<p>All signals are stereophonic and encoded at 44.1kHz.</p>
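<p>A minimal sketch of parsing this folder structure with the <em>musdb</em> Python package (the <code>is_wav=True</code> flag selects the uncompressed wav layout described above; the root path is illustrative):</p>
<pre><code class="language-python"># Iterate over MUSDB18-HQ training tracks and access the mixture and stems.
import musdb

mus = musdb.DB(root="/data/musdb18hq", is_wav=True, subsets="train")
for track in mus:
    mixture = track.audio                       # (n_samples, 2) array at 44.1 kHz
    vocals = track.targets["vocals"].audio      # corresponding isolated vocals stem
    print(track.name, mixture.shape, track.rate)</code></pre>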
<p><strong>LICENSE</strong></p>
<p>MUSDB18-HQ is provided for educational purposes only, and the material contained in it should not be used for any commercial purpose without the express permission of the copyright holders:</p>
<p> 100 tracks are taken from the DSD100 data set, which is itself derived from The ‘Mixing Secrets’ Free Multitrack Download Library. Please refer to this original resource for any question regarding your rights on your use of the DSD100 data.<br>
46 tracks are taken from the MedleyDB licensed under Creative Commons (BY-NC-SA 4.0).<br>
2 tracks were kindly provided by Native Instruments originally part of their stems pack.<br>
2 tracks a from from the Canadian rock band The Easton Ellises as part of the heise stems remix competition, licensed under Creative Commons (BY-NC-SA 3.0).</p>
<p><strong>REFERENCE</strong></p>
<p>If you use the MUSDB18-HQ dataset in your research, please cite it as follows:</p>
<pre><code>@misc{MUSDB18HQ,
author = {Rafii, Zafar and
Liutkus, Antoine and
Fabian-Robert St{\"o}ter and
Mimilakis, Stylianos Ioannis and
Bittner, Rachel},
title = {{MUSDB18-HQ} - an uncompressed version of MUSDB18},
month = dec,
year = 2019,
doi = {10.5281/zenodo.3338373},
url = {https://doi.org/10.5281/zenodo.3338373}
}</code></pre>
<p>If you compare your results with SiSEC 2018 participants, please cite the SiSEC 2018 LVA/ICA paper:</p>
<pre><code>@inproceedings{SiSEC18,
author="St{\"o}ter, Fabian-Robert and Liutkus, Antoine and Ito, Nobutaka",
title="The 2018 Signal Separation Evaluation Campaign",
booktitle="Latent Variable Analysis and Signal Separation:
14th International Conference, LVA/ICA 2018, Surrey, UK",
year="2018",
pages="293--305"
}</code></pre>
<p> </p>
__LICENSE AGREEMENT__: MUSDB18HQ is provided for educational purposes only and the material contained in them should not be used for any commercial purpose without the express permission of the copyright holders.
https://doi.org/10.5281/zenodo.3338373
oai:zenodo.org:3338373
Zenodo
https://zenodo.org/communities/sigsep
https://doi.org/10.5281/zenodo.3338372
info:eu-repo/semantics/openAccess
Other (Non-Commercial)
MUSDB18-HQ - an uncompressed version of MUSDB18
info:eu-repo/semantics/other