Dataset Open Access

Arab-Andalusian music corpus

Chaachoo, Amin; Sordo, Mohamed; Pretto, Niccolo; Caro Repetto, Rafael; Bozkurt, Baris; Serra, Xavier

This repository contains the Arab-Andalusian music corpus collected within the CompMusic project.

The following files are available for 164 concert recordings (total playing time of more than 125 hours):

- Audio in MP3 format (44.1 kHz or 48 kHz sampling rate, 128 kbps or higher, mono or stereo)

- Scores in MusicXML format (manual transcriptions by the first author)

- Automatically computed pitch (text format) and pitch distribution (JSON format) descriptors
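As a rough illustration of how a pitch track relates to a pitch distribution, the sketch below bins voiced frames into a normalized histogram measured in cents. It is not the procedure used to compute the descriptors in this corpus. The two-column layout (time in seconds, frequency in Hz), the use of non-positive values for unvoiced frames, the 440 Hz reference, the 50-cent bin width, and the function names are all assumptions made for the example; check them against the actual files before relying on it.

```python
import math
from collections import Counter


def parse_pitch_track(text):
    """Parse lines of 'time frequency_hz' (assumed layout); skip malformed lines."""
    track = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            track.append((float(parts[0]), float(parts[1])))
        except ValueError:
            continue
    return track


def pitch_distribution(track, ref_hz=440.0, bin_cents=50):
    """Return a normalized histogram of voiced frames, binned in cents from ref_hz."""
    counts = Counter()
    for _, hz in track:
        if hz <= 0:  # assumption: non-positive values mark unvoiced frames
            continue
        cents = 1200.0 * math.log2(hz / ref_hz)
        # Snap each frame to the nearest bin centre (a multiple of bin_cents).
        counts[round(cents / bin_cents) * bin_cents] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {b: c / total for b, c in sorted(counts.items())}


# Hypothetical four-frame track: one unvoiced frame, two at 440 Hz, one at 880 Hz.
sample = "0.00 0.0\n0.01 440.0\n0.02 440.0\n0.03 880.0\n"
distribution = pitch_distribution(parse_pitch_track(sample))
```

In practice the reference frequency would usually be the tonic of the recording rather than a fixed 440 Hz, so that distributions from different recordings are comparable.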

The metadata of the recordings (title, form, mizan, nawba and tab) is provided in separate JSON files. The corresponding MusicBrainz collection is available at this link. The metadata is subject to improvement, as it is collected via crowdsourcing on MusicBrainz. We gather and share new versions of the metadata (for the same audio content) at this link.
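A minimal sketch of how the per-recording metadata might be grouped, for example to list recordings by nawba. It assumes each JSON file decodes to a flat object with string-valued keys named "title", "form", "mizan", "nawba" and "tab"; the real files may nest or name these fields differently. The function names and sample values are hypothetical placeholders, not data from the corpus.

```python
import json
from collections import defaultdict


def load_records(json_texts):
    """Decode one metadata object per JSON document (assumed flat layout)."""
    return [json.loads(text) for text in json_texts]


def group_titles(records, field):
    """Map each value of `field` to the titles of the recordings that carry it."""
    groups = defaultdict(list)
    for record in records:
        key = record.get(field, "unknown")  # tolerate records missing the field
        groups[key].append(record.get("title", ""))
    return dict(groups)


# Placeholder documents standing in for the contents of three metadata files.
documents = [
    '{"title": "Recording 1", "nawba": "nawba A", "mizan": "mizan X"}',
    '{"title": "Recording 2", "nawba": "nawba B", "mizan": "mizan X"}',
    '{"title": "Recording 3", "nawba": "nawba A"}',
]
records = load_records(documents)
by_nawba = group_titles(records, "nawba")
by_mizan = group_titles(records, "mizan")
```

Because the metadata is crowdsourced and updated over time, falling back to "unknown" for missing fields keeps the grouping usable across metadata versions.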

The lyrics for the recordings are available from the Arab-Andalusian music lyrics dataset.

For more information, please refer to

A scientific publication making use of this database is available here: Nawba Recognition for Arab-Andalusian Music Using Templates From Music Scores.

This work also received the support of the Musical Bridges project, funded by RecerCaixa.
Files (8.9 GB)


Cite as