There is a newer version of this record available.

Dataset Restricted Access

The DALI dataset

Meseguer Brocal, Gabriel

Cohen-Hadria, Alice; Peeters, Geoffroy

The DALI dataset is a large dataset of synchronised audio, lyrics and notes. For each song, it provides the full-duration audio together with its time-aligned lyrics and its time-aligned notes (of the vocal melody). Lyrics are described according to four levels of granularity: notes (and the textual information underlying a given note), words, lines and paragraphs. For each song, we also provide additional multimodal information such as genre, language, musician, album covers or links to video clips.
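As a rough illustration of the four levels of granularity described above, the sketch below models each level as a list of time-stamped segments, where every segment points to its parent in the next coarser level. All names here (`Segment`, `parent`, `pitch_hz`, `children`, `span`) and the toy values are assumptions made for this example; they are not the API or file format of the DALI tools.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical structure for illustration only; not the DALI package's API.
LEVELS = ("notes", "words", "lines", "paragraphs")  # finest to coarsest


@dataclass
class Segment:
    text: str                         # lyric text covered by this segment
    start: float                      # onset time in seconds
    end: float                        # offset time in seconds
    parent: Optional[int] = None      # index into the next coarser level
    pitch_hz: Optional[float] = None  # assumed field, meaningful for notes only


def children(level: List[Segment], parent_idx: int) -> List[Segment]:
    """Return the segments of a finer level that belong to one parent."""
    return [s for s in level if s.parent == parent_idx]


def span(segments: List[Segment]) -> Tuple[float, float]:
    """Return the time interval covered by a group of segments."""
    return (min(s.start for s in segments), max(s.end for s in segments))


# Toy annotation (invented values): two words sung over three notes.
notes = [
    Segment("hel", 0.0, 0.4, parent=0, pitch_hz=220.0),
    Segment("lo", 0.4, 0.9, parent=0, pitch_hz=247.0),
    Segment("world", 1.0, 1.6, parent=1, pitch_hz=262.0),
]
words = [
    Segment("hello", 0.0, 0.9, parent=0),
    Segment("world", 1.0, 1.6, parent=0),
]
lines = [Segment("hello world", 0.0, 1.6, parent=0)]
paragraphs = [Segment("hello world", 0.0, 1.6)]

# The notes under the first word cover the same interval as that word.
first_word_notes = children(notes, 0)
print(span(first_word_notes), (words[0].start, words[0].end))
```

With links of this kind, an annotation at one level can be mapped to the levels above or below it, for example to find which notes carry a given word.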


Go to where you can find all the tools to work with the DALI dataset and a detailed description of how to use it.


Cite the paper:

  title={Dali: A large dataset of synchronized audio, lyrics and notes, automatically created using teacher-student machine learning paradigm},
  author={Meseguer-Brocal, Gabriel and Cohen-Hadria, Alice and Peeters, Geoffroy},
  journal={arXiv preprint arXiv:1906.10606},


This research has received funding from the French National Research Agency under the contract ANR-16-CE23-0017-01 (WASABI project).

Restricted Access

You may request access to the files in this upload, provided that you fulfil the conditions below. The decision whether to grant/deny access is solely under the responsibility of the record owner.

DALI by Gabriel Meseguer-Brocal, Alice Cohen-Hadria and Geoffroy Peeters. DALI is offered free of charge for non-commercial research use only under the terms of the Creative Commons Attribution Noncommercial License:

The DALI dataset is provided for educational purposes only, and the material contained in it should not be used for any commercial purpose without the express permission of the copyright holders.


Please provide your affiliation and planned application in the justification message.


                  All versions   This version
Views             6,234          4,607
Downloads         600            349
Data volume       1.3 TB         34.5 GB
Unique views      3,973          3,322
Unique downloads  395            270

