MUSDB18 lyrics extension

Kilian Schulze-Forster; Clement S. J. Doire; Gaël Richard; Roland Badeau

doi:10.5281/zenodo.3989267

Published March 15, 2021 | Version 1.0.0

1. LTCI, Télécom Paris, Institut Polytechnique de Paris
2. Sonos Inc., Paris, France
3. LTCI, Télécom Paris, Institut Polytechnique de Paris

This is a set of annotated lyrics transcripts for songs belonging to the MUSDB18 dataset. The set comprises lyrics of all songs which have English lyrics, i.e. 96 out of 100 songs for the training set and 45 out of 50 songs for the test set. MUSDB18 is a dataset for music source separation and provides the following separated tracks for each song: vocals, bass, drums, other (rest of the accompaniment), mixture.

The lyrics transcripts, together with the audio files of MUSDB18, are a valuable resource for research on tasks such as text-informed singing voice separation, automatic lyrics alignment, automatic lyrics transcription, and singing voice synthesis and analysis. The provided data should be used for research purposes only.

Disclaimer

The lyrics were transcribed manually by the authors, who are not native English speakers, so the transcriptions are likely to contain some errors. The composers of the songs are the copyright holders of the original lyrics.

The songs were divided into sections between 3 and 12 seconds long. Section boundaries were chosen primarily so that they fall on natural pauses and do not cut vocal sounds. The sections therefore do not necessarily correspond to lyrically meaningful lines. Most sections do not overlap; some overlap by 1 second. In some difficult cases where the words are barely intelligible, e.g. shouting in metal songs or mumbled words, we made an effort to make the transcriptions as phonetically accurate as possible and did not prioritize semantically meaningful phrases.

Citation

The dataset was built for the paper

Schulze-Forster, K., Doire, C., Richard, G., & Badeau, R. "Phoneme Level Lyrics Alignment and Text-Informed Singing Voice Separation." IEEE/ACM Transactions on Audio, Speech and Language Processing (2021).

If you use the data for your research, please cite the corresponding paper:

@article{schulze2021phoneme,
  title={Phoneme Level Lyrics Alignment and Text-Informed Singing Voice Separation},
  author={Schulze-Forster, Kilian and Doire, Clement and Richard, Ga{\"e}l and Badeau, Roland},
  journal={IEEE/ACM Transactions on Audio, Speech and Language Processing},
  year={2021},
  publisher={IEEE}
}

Annotations

For each section, the annotations comprise: the start and end time, the corresponding lyrics, and a label indicating one of the following four properties:

(a) only one person is singing
(b) several singers are pronouncing the same phonemes at the same time (possibly singing different notes)
(c) several singers are pronouncing different phonemes simultaneously (possibly singing different notes)
(d) no singing

Segments labelled with property (b) or (c) do not necessarily have this property over the whole segment duration. Label (b) was assigned as soon as several singers are present anywhere in a segment; label (c) was assigned as soon as they sang different phonemes at the same time anywhere in the segment. Properties (a) and (d) hold for the entire segment. Furthermore, segments with property (c) can contain either one or more lead singers singing words in the presence of background singers singing long vowels such as 'ah' or 'oh', or multiple singers who sing different words at the same time. In the latter case, it was very difficult to recognise the sung words and to decide in which order to transcribe words or phrases sung simultaneously. These segments are marked with a '*' and it is recommended to reject them for most use cases.

The annotations have the following format:
<start time> <end time> <vocals property> <lyrics>

Example:
00:18 00:23 a i know the reasons why --> starts at 18 sec., ends at 23 sec., vocals type (a), lyrics: i know the reasons why
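
As an illustration of the format above, the following is a minimal sketch of how one annotation line could be parsed in Python. It is not part of the provided files. The names Segment, to_seconds and parse_line are hypothetical, and the sketch assumes that times are written as mm:ss, that fields are separated by whitespace, and that the lyrics field may be empty (for instance for label (d)). It does not interpret the '*' marker, since its position in the line is not specified here.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One annotated segment (hypothetical container, not from the dataset)."""
    start: int   # start time in seconds
    end: int     # end time in seconds
    label: str   # vocals property exactly as written in the file, e.g. "a"
    lyrics: str  # transcript; empty if the line has no lyrics field


def to_seconds(timestamp: str) -> int:
    """Convert an mm:ss timestamp (assumed format) to seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + int(seconds)


def parse_line(line: str) -> Segment:
    """Parse '<start time> <end time> <vocals property> <lyrics>'."""
    # Split on whitespace at most three times so the lyrics stay in one piece.
    parts = line.strip().split(maxsplit=3)
    if len(parts) < 3:
        raise ValueError(f"expected at least 3 fields, got: {line!r}")
    start, end, label = parts[:3]
    lyrics = parts[3] if len(parts) == 4 else ""
    return Segment(to_seconds(start), to_seconds(end), label, lyrics)


segment = parse_line("00:18 00:23 a i know the reasons why")
print(segment.start, segment.end, segment.label, segment.lyrics)
```

Segments could then be filtered by label, for example keeping only those whose label is "a", before cutting the audio.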

The Python script musdb_lyrics_cut_audio.py is provided to automatically cut the MUSDB songs into the annotated segments. The script requires the musdb and soundfile packages. The user needs to update the paths and select the desired sources and vocals types in lines 19-26. For each annotated segment, the script saves one WAV file per selected source and the corresponding lyrics as a text file. The MUSDB training partition is divided into a training and a validation set. The tracks for the validation set can be changed below line 29.

The file words_and_phonemes.txt contains a list of all words and their decomposition into phonemes. The phonemes are written in 2-letter ARPABET style and were obtained with the LOGIOS Lexicon Tool.

License

The data is licensed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, read the provided LICENSE.txt file, visit https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

The creators of MUSDB18 lyrics extension and their corresponding affiliation institutes are not liable for, and expressly exclude, all liability for loss or damage however and whenever caused to anyone by any use of MUSDB18 lyrics extension or any part of it.

Acknowledgment

The authors would like to thank Olumide Okubadejo and Sinead Namur for their help with transcribing and correcting part of the lyrics.

Files

Files (215.1 kB)

Name	Size	MD5
LICENSE.txt	20.8 kB	fb5d051e53001fdff7fec0f368f47190
musdb_lyrics_cut_audio.py	9.3 kB	4df81b061c0efc65bb44a9efc0defb26
README.txt	5.1 kB	f898c0a6cbd1cdb888088da91c53f7b5
test_lyrics.zip	39.7 kB	1fadfcc287a0cbd319f7ddf6e09b78bb
train_lyrics.zip	74.0 kB	dc89e2175edb94eca26dd504a2eef1c7
words_and_phonemes.txt	66.1 kB	201b26cbca65d0e2be385d184b52d244

Additional details

European Commission
MIP-Frontiers - New Frontiers in Music Information Processing 765068
