Published June 21, 2024 | Version v2
Dataset Open

Musdb-XL-train

  • 1. Seoul National University

Description

Here, we present the musdb-XL-train dataset for training De-Limiter networks.

 

%%% Important Notes (2024-06-21) %%%

We recently discovered some errors in the musdb-XL-train dataset. Specifically, about 7% of the training data (ozone_seg_0.wav ~ ozone_seg_20000.wav) had slight phase shift problems. If you are already using the musdb-XL-train dataset, please download the updated version. Sorry for the inconvenience. 

%%%%%%%%%%%%%%%%%%%%%%

 

 

The musdb-XL-train dataset consists of 300,000 limiter-applied 4-second audio segments and the 100 original songs. For each segment, we randomly chose arbitrary segments from the 4 stems (vocals, bass, drums, other) of the musdb-HQ training subset and randomly mixed them. Then, we applied a commercial limiter plug-in to each stem.

 

Once the download finishes, unzip the archive. The data is about 200 to 210 GB, so please make sure you have enough free space.
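Before unzipping, it may help to confirm that the target drive has enough room and that the archive arrived intact. The following is a minimal sketch, not part of the official tooling: the 210 GB figure is the upper estimate quoted above (interpreted here as decimal gigabytes, which is an assumption), the expected MD5 is the checksum listed under Files on this record, and the function names are hypothetical.

```python
import hashlib
import shutil

REQUIRED_BYTES = 210 * 10**9  # upper estimate quoted above; decimal GB is an assumption
EXPECTED_MD5 = "417b5f4f4f29596a7bb148cefb088de6"  # checksum listed on this record


def has_enough_space(target_dir: str, required: int = REQUIRED_BYTES) -> bool:
    """Return True if the filesystem holding target_dir has `required` bytes free."""
    return shutil.disk_usage(target_dir).free >= required


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `md5_of_file(<path to the downloaded archive>)` against `EXPECTED_MD5` then indicates whether the download is complete; reading in chunks keeps memory use flat for an archive of this size.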

Due to copyright issues, the dataset contains sample-wise gain parameters (in .npy files) instead of the wave files themselves; these parameters are used to generate each wave file of musdb-XL-train from the musdb18-HQ dataset. You should first prepare the musdb18-HQ dataset (https://zenodo.org/record/3338373). With musdb18-HQ and this downloaded data (.npy and .csv), run the data processing code in our GitHub repository (https://github.com/jeonchangbin49/De-limiter; please check the 'Musdb-XL-train' section) to obtain the actual wave files of musdb-XL-train. After the data processing step finishes, you can remove the "np_ratio" folder that contains the sample-wise gain ratio parameters, but you should keep the csv files because they are used in our training process.
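Conceptually, reconstruction scales the original audio by the stored sample-wise gain. The sketch below illustrates only that idea and is not the repository's processing script: the function name `apply_samplewise_gain`, the array shapes, and the reading of the gain as a per-sample multiplicative ratio are assumptions made for illustration. The GitHub code remains the reference for the actual file layout and processing.

```python
import numpy as np


def apply_samplewise_gain(audio: np.ndarray, gain_ratio: np.ndarray) -> np.ndarray:
    """Scale each sample of `audio` by the matching entry of `gain_ratio`.

    Assumed shapes (hypothetical): audio is (channels, samples); gain_ratio is
    either (samples,) or (channels, samples).
    """
    if gain_ratio.shape[-1] != audio.shape[-1]:
        raise ValueError("gain_ratio and audio must have the same number of samples")
    return audio * gain_ratio  # broadcasting handles the (samples,) case


# Toy data standing in for a musdb18-HQ segment and its gain parameters.
audio = np.array([[0.5, -1.0, 0.25], [0.5, 1.0, -0.25]])
gain_ratio = np.array([1.0, 0.5, 1.0])  # in practice read from a .npy file with np.load
limited = apply_samplewise_gain(audio, gain_ratio)
```

In this toy case only the middle sample, which has a gain of 0.5, is attenuated; the others pass through unchanged.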

 

Note that our previous musdb-XL (https://zenodo.org/record/7041331) data is an evaluation dataset, whereas musdb-XL-train is a training dataset.

 

--Dataset Construction

For the commercial limiter plug-in, we used the iZotope Ozone 9 Maximizer, following our previous work, musdb-XL, which is a mastered version (with a limiter applied, but no EQ) of the musdb-HQ test subset.

The threshold parameters (which control how much limiting is applied) of the Ozone 9 Maximizer were chosen to reach a target loudness randomly sampled from a Gaussian distribution (mean -8, std 1). The parameters of the Gaussian distribution were selected following the statistics of recent pop music (refer to Table 1 of our previous paper, https://arxiv.org/abs/2208.14355).

The character parameters (related to the attack and release parameters) of the limiter were randomly sampled from a gamma distribution (a=2, scale=1, as parameterized in https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gamma.html).
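The two sampling steps described above can be sketched as follows. This is an illustrative example, not the code used to build the dataset: the function name `sample_limiter_targets` and the seed are hypothetical, and NumPy's generator is used in place of SciPy (`rng.gamma(shape=2, scale=1)` follows the same parameterization as `scipy.stats.gamma(a=2, scale=1)`). How a sampled value is mapped onto the plug-in's controls is not described here, so the sketch does not attempt it.

```python
import numpy as np


def sample_limiter_targets(n: int, rng: np.random.Generator):
    """Draw n target loudness values and n character values."""
    # Target loudness: Gaussian with mean -8 and standard deviation 1.
    target_loudness = rng.normal(loc=-8.0, scale=1.0, size=n)
    # Character: gamma distribution with shape a=2 and scale=1.
    character = rng.gamma(shape=2.0, scale=1.0, size=n)
    return target_loudness, character


rng = np.random.default_rng(seed=0)  # seed chosen only to make the sketch repeatable
loudness, character = sample_limiter_targets(10_000, rng)
```

With these parameters the character values are always non-negative and average about 2 (shape times scale), while the loudness targets cluster around -8.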

The random mix parameters (gain and channel swap) are provided as csv files in our dataset.
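To make the role of the gain and channel-swap parameters concrete, here is a minimal sketch of a random stem mix. It is an assumption-laden illustration rather than the procedure used for the dataset: the function name `random_mix`, the gain range, and the swap probability are hypothetical values chosen for the example, and the actual per-segment values are the ones stored in the csv files.

```python
import numpy as np


def random_mix(stems, rng, gain_range=(0.5, 1.0), swap_prob=0.5):
    """Mix stereo stems with a random gain and optional channel swap per stem.

    gain_range and swap_prob are hypothetical, not the dataset's settings.
    Each stem has shape (2, samples). Returns the mix and the parameters used.
    """
    mix = np.zeros_like(next(iter(stems.values())), dtype=np.float64)
    params = {}
    for name, audio in stems.items():
        gain = float(rng.uniform(*gain_range))
        swap = bool(rng.random() < swap_prob)
        if swap:
            audio = audio[::-1]  # exchange left and right channels
        mix += gain * audio
        params[name] = {"gain": gain, "channel_swap": swap}
    return mix, params


rng = np.random.default_rng(0)
# Synthetic noise standing in for 4 stem segments of 8 samples each.
stems = {name: rng.standard_normal((2, 8)) for name in ("vocals", "bass", "drums", "other")}
mix, params = random_mix(stems, rng)
```

Returning the parameters alongside the mix is what makes a segment reproducible later: given the same stems and the recorded gain and swap flags, the mix can be rebuilt exactly.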

 

 

Files (18.2 GB)

md5:417b5f4f4f29596a7bb148cefb088de6