Published June 18, 2018 | Version 2.0
Dataset Open

Arab-Andalusian music lyrics dataset

Description

The dataset contains lyrics for the songs in the Arab-Andalusian music collection curated within the CompMusic project that belong to the nawbas "Isbahan", "Maya", "Raml Maya", "Gharibat al-Husayn", "Hijaz Kabir", "Hijaz Msharqi", "Istihlal", "Rasd", and "Rasd Dayl".

Lyrics are stored in two formats: as Tab Separated Values (TSV) files and as JSON files.

Each file is identified by its MusicBrainz recording ID (MBID).

The lyrics are stored both in their original Arabic script (folder 'original') and in a romanized/transliterated version (folder 'transliterated') that follows the American Library Association-Library of Congress (ALA-LC) romanization standard.
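As a rough illustration of how the layout described above might be used, the following Python sketch collects every file for one recording and parses it. It assumes that files are named `<MBID>.json` or `<MBID>.tsv` somewhere below the extracted archive root, and that they are UTF-8 encoded; neither detail is stated here, so the README in the archive remains the authoritative description. The function names are hypothetical.

```python
import csv
import json
from pathlib import Path


def find_lyrics_files(root, mbid):
    """Return all files below `root` named '<mbid>.<extension>'.

    Assumption: each lyrics file is named after its MusicBrainz
    recording ID; check the README for the actual naming scheme.
    """
    root = Path(root)
    return sorted(p for p in root.rglob(f"{mbid}.*") if p.is_file())


def load_lyrics(path):
    """Parse one lyrics file: JSON as-is, TSV as a list of rows."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        with path.open(encoding="utf-8") as fh:
            return json.load(fh)
    if suffix == ".tsv":
        with path.open(encoding="utf-8", newline="") as fh:
            return list(csv.reader(fh, delimiter="\t"))
    raise ValueError(f"unsupported lyrics format: {path.suffix}")
```

Because the search is recursive, a single call returns the matches from both the 'original' and the 'transliterated' folders, in whichever of the two formats are present.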

Corresponding audio files are available from the Arab-Andalusian music corpus, as well as from the Internet Archive URLs included in the metadata file ('metadata.csv').
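The column names of 'metadata.csv' are not listed here, so the sketch below avoids assuming any. It only assumes a comma-separated, UTF-8 file with a header row, and returns every row in which some cell contains the given MBID; the Internet Archive URL can then be read from whichever column the README documents. The function name is hypothetical.

```python
import csv


def find_metadata_rows(csv_path, mbid):
    """Return the metadata rows in which any cell contains `mbid`.

    Assumption: the file is comma-separated with a header row.
    Column names are deliberately not hard-coded.
    """
    with open(csv_path, encoding="utf-8", newline="") as fh:
        return [
            row
            for row in csv.DictReader(fh)
            if any(mbid in (value or "") for value in row.values())
        ]
```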

For more information about the exact format and contents of the dataset, please consult the README provided in the archive.

For more information about the CompMusic corpora, please refer to http://compmusic.upf.edu/corpora.

Notes

New version prepared by Alia Morsi.

Files

Sanas_v2.zip (989.4 kB)

md5:e61ba692fa9b3ddfad95e8326ac799cf
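The MD5 checksum listed for Sanas_v2.zip can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the function names and the local file path are assumptions, while the expected digest is the one given in this record.

```python
import hashlib

# Checksum published for Sanas_v2.zip in this record.
EXPECTED_MD5 = "e61ba692fa9b3ddfad95e8326ac799cf"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches `expected`."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_archive("Sanas_v2.zip")` would return `True` for an uncorrupted copy saved under that name in the working directory.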

Additional details

Funding

European Commission: COMPMUSIC – Computational models for the discovery of the world's music (grant 267583)