Published May 6, 2025 | Version v2
Dataset · Open Access

CANTO-JRP Dataset: Audio Pitch Extractions from the Josquin Research Project

  • Utrecht University

Description

The CANTO-JRP Dataset is based on compositions from the Josquin Research Project that were available on Spotify at the time this dataset was created. Due to copyright restrictions, the recordings are not publicly available. However, the dataset includes multiple f0 estimations (from various models), symbolic encodings, and metadata. CANTO-JRP is part of the CANTOSTREAM project. On Spotify, we created the CANTO-JRP playlist that is fully aligned with this dataset.

Please find a detailed description of the dataset in the README.md.

 

This dataset is described in our article:

Visscher, M., & Wiering, F. (2025). Fuzzy Frequencies: Finding Tonal Structures in Audio Recordings of Renaissance Polyphony. Heritage, 8(5), 164. https://doi.org/10.3390/heritage8050164

If you use this dataset for your research, please cite this paper.

Technical info (English)

General

This README.md file provides an overview of the CANTO-JRP Dataset, which is based on compositions from the Josquin Research Project, limited to those available on Spotify at the time the dataset was created. Due to copyright restrictions, the recordings themselves are not publicly available. Instead, the dataset includes multiple f0 estimations (from various models), symbolic encodings, and metadata.

Folder Structure

Each set of multiple f0 extractions is stored in its own folder. Due to their large size, these folders are compressed into separate tar.gz files. To use the data with the accompanying code from GitHub, download and extract the relevant folders into FuzzyFrequencies/data/raw/CANTO-JRP/. If you're only interested in the Multif0 extractions, you can download just the experiment_metadata.csv and the multif0.tar.gzip.
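As a convenience, the download-and-extract step can be scripted. The sketch below assumes you have already downloaded an archive (the default destination path matches the one the GitHub code expects; the helper name is illustrative, not part of the dataset's tooling):

```python
import tarfile
from pathlib import Path

def extract_archive(archive: str,
                    dest: str = "FuzzyFrequencies/data/raw/CANTO-JRP") -> Path:
    """Unpack one of the dataset's tar.gz archives into the folder
    expected by the accompanying GitHub code."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Each archive unpacks into its own subfolder (e.g. multif0/).
        tar.extractall(dest_path)
    return dest_path

# Usage (after downloading the archive from this record):
# extract_archive("multif0.tar.gzip")
```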

Folders and files in this dataset

  • 195f
  • 214c
  • basicpitch
  • MT3
  • multif0
  • symbolic
  • experiment_metadata.csv
  • README.md

Description of dataset items

This section provides an overview of the data in each folder, how the data should be used, and its purpose.

| Folder | File types | Source | Number of files | Format |
|---|---|---|---|---|
| 195f | Multipitch extractions, model 195f | Fuzzy Frequencies | 637 | CSV |
| 214c | Multipitch extractions, model 214c | Fuzzy Frequencies | 637 | CSV |
| basicpitch | Basicpitch extractions | Fuzzy Frequencies | 637 | CSV |
| MT3 | MT3 extractions | Fuzzy Frequencies | 172 | MIDI |
| multif0 | Multif0 extractions | Fuzzy Frequencies | 637 | CSV |
| symbolic | Symbolic encodings | Josquin Research Project (JRP) | 637 | MusicXML |

Note: The number of MT3 extractions is lower than for the other models. Due to the MT3 model's lower performance on our dataset and its high computational cost, we only processed audio files smaller than approximately 10 MB.

File Types

Metadata

The file experiment_metadata.csv contains information about each composition from the JRP that was available on Spotify at the time this dataset was created. This file serves both as a reference for users of the dataset and as a specification file for the GitHub code.

| Field | Format | Description |
|---|---|---|
| id | integer | row identifier |
| nr_playlist | string | position(s) in the playlist |
| composer | string | composer's surname |
| composition | string | name of the composition |
| voices | integer | number of voices |
| experiment | string | experiment name, needed for the code |
| performer | string | performer(s) of the recording |
| Album | string | album name of the recording |
| year_recording | integer | year of recording |
| audio_final | integer | MIDI tone of the lowest note of the final chord |
| symbolic_is_audio | string | extent to which recording and encoding are the same (yes, almost, no) |
| instrumentation | string | instrumentation: (v)ocal and/or (i)nstrumental |
| instrumentation_category | string | category of instrumentation (vocal, instrumental, mixed) |
| final_safe | string | extent to which the audio final is the same as the (transposed) encoded final (yes, no, pitch class profile) |
| not repeated | string | whether there is repetition of the encoding in the recording (yes, no) |
| repetitions | string | rough specification of the repetitions |
| comments | string | extra comments, mainly on the instruments used |
| symbolic | string | file name of the symbolic encoding |
| audio | string | file name of the recording |
| mf0 | string | file name of the Multif0 extraction |
| basicpitch | string | file name of the Basicpitch extraction |
| multipitch_214c | string | file name of the Multipitch extraction, model 214c |
| multipitch_195f | string | file name of the Multipitch extraction, model 195f |
| MT3 | string | file name of the MT3 extraction |
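As an illustrative sketch (the helper names are my own, not part of the dataset), the metadata file can be read with Python's standard csv module and filtered on documented fields such as final_safe:

```python
import csv

def load_metadata(path: str) -> list[dict]:
    """Read experiment_metadata.csv into a list of row dicts keyed by
    the column names documented above."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

def safe_finals(rows: list[dict]) -> list[dict]:
    """Keep only rows whose audio final matches the (transposed)
    encoded final, per the final_safe flag."""
    return [r for r in rows if r.get("final_safe") == "yes"]
```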

Multipitch extraction

The Multipitch extractions include a column for each MIDI tone, with cell values representing the loudness of the pitch at a given timestamp.

| Field | Format | Description |
|---|---|---|
| [empty] | integer | timestamp index, sample rate = 43.06640625 Hz |
| 1 | float | loudness of MIDI tone 25 (column index 1 + offset 24) |
| .. | .. | .. |
| 71 | float | loudness of MIDI tone 85 (column index 71 + offset 24) |

Basicpitch extraction

The Basicpitch extractions include a row for each detected note and its corresponding loudness.

| Field | Format | Description |
|---|---|---|
| start_time_s | float | start time of the pitch in seconds |
| end_time_s | float | end time of the pitch in seconds |
| pitch_midi | integer | MIDI tone of the pitch |
| velocity | integer | MIDI equivalent of loudness |
| pitch_bend | integer | multiple columns of microtonal pitch deviations |
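Since each row is one note event with a start and end time, per-note durations can be derived directly from the CSV. A minimal sketch with the standard library (the function name is my own; real files may carry extra pitch_bend columns, which DictReader simply ignores here):

```python
import csv

def note_durations(path: str) -> list[float]:
    """Compute the duration in seconds of each detected note in a
    Basicpitch CSV (columns start_time_s, end_time_s, pitch_midi,
    velocity, ...)."""
    with open(path, newline="", encoding="utf-8") as f:
        return [float(row["end_time_s"]) - float(row["start_time_s"])
                for row in csv.DictReader(f)]
```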

MT3 extraction

The MT3 extractions are provided in MIDI format (Musical Instrument Digital Interface). MIDI is an industry standard music technology protocol used to represent musical data and allow communication between musical devices. For more details, see the MIDI specifications.

Multif0 extraction

The Multif0 extractions do not have meaningful headers: the first column contains the timestamps, and the subsequent 'voice' columns contain the detected frequencies, without voice leading. By default, the leftmost voice column contains the lowest detected frequency.

| Field | Format | Description |
|---|---|---|
| 0.0 | float | timestamp in seconds, time sample rate = 86.1328125 Hz |
| [empty] | float | frequency of the lowest voice at that timestamp, frequency resolution = 20 cents |
| .. | .. | .. |
| [empty] | float | frequency of the highest voice at that timestamp, frequency resolution = 20 cents |
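Because the voice columns hold raw frequencies in Hz, comparing them with the MIDI-based extractions requires a standard frequency-to-MIDI conversion. A minimal sketch, assuming the usual A4 = 440 Hz reference (the function name is illustrative):

```python
import math

# Multif0 timestamps run at 86.1328125 rows per second.
MULTIF0_FRAME_RATE = 86.1328125

def hz_to_midi(freq: float) -> float:
    """Convert a Multif0 frequency in Hz to a fractional MIDI tone,
    assuming A4 = 440 Hz (so 440 Hz -> 69.0)."""
    return 69.0 + 12.0 * math.log2(freq / 440.0)
```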

Symbolic encoding

The symbolic encodings are provided in MusicXML format. For an introduction to this format, please see the MusicXML tutorial.

Codebook

In this section, we specify for each file type how the data was collected or created.

For 611 out of the 902 works on the JRP website, usable recordings were found on Spotify; these are collected in the Spotify playlist.

The Basicpitch extractions were created by applying the model by Bittner et al. (2022) [1] to the set of audio recordings.

The Multipitch extractions were created by applying the model by Weiß and Müller (2024) [4] to the audio recordings, using models 214c and 195f.

The MT3 extractions were created by transcribing the audio files smaller than ~110 MB with the Colab notebook provided by Gardner et al. (2022) [3].

The Multif0 extractions were created by applying the model by Cuesta et al. (2020) [2] to the audio files.

The symbolic encodings were downloaded from The Josquin Research Project.

The files experiment_metadata.csv and README.md were handcrafted by the first author.

Notes (English)

Contribute

  • To report errors, e.g. in composition dates, please create an issue describing the error.
  • Contact me if you want to contribute data.
  • Do not hesitate to contact me with further questions: m.e.visscher@uu.nl.

Cite

Finally, if you use the code in a research project, please reference it as:

Visscher, M., & Wiering, F. (2025). Fuzzy Frequencies: Finding Tonal Structures in Audio Recordings of Renaissance Polyphony. Heritage, 8(5), 164. https://doi.org/10.3390/heritage8050164

@article{visscher2025fuzzy,
  title     = {Fuzzy Frequencies: Finding Tonal Structures in Audio Recordings of Renaissance Polyphony},
  author    = {Visscher, M. and Wiering, F.},
  journal   = {Heritage},
  volume    = {8},
  number    = {5},
  pages     = {164},
  year      = {2025},
  doi       = {10.3390/heritage8050164},
  url       = {https://doi.org/10.3390/heritage8050164}
}



Files (12.3 GB)

| Size | MD5 checksum |
|---|---|
| 6.0 GB | md5:dace9c1d94767925494b9ea321a85165 |
| 6.1 GB | md5:3ca66b89fa879baef7bfd138e1b5c669 |
| 25.1 MB | md5:eb81b22e46cb9251f09fb59d6abba4f7 |
| 285.0 kB | md5:b7664d40e60e2379c47c2fc59b4726b2 |
| 622.1 kB | md5:8cf29e4b6774f177af9ecb45057d0728 |
| 142.6 MB | md5:83682900e7a70fbe743b6f5df2fe9469 |
| 12.9 kB | md5:c9d3fa91f73ceb6286c55baa3a9ef2ac |
| 9.8 MB | md5:9209fcb0bd8cb1f52b77c4a369ee8cea |

Additional details

References

  • [1] Bittner, R.M.; Bosch, J.J.; Rubinstein, D.; Meseguer-Brocal, G.; Ewert, S. A lightweight instrument-agnostic model for polyphonic note transcription and multipitch estimation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Singapore, 2022.
  • [2] Cuesta, H.; McFee, B.; Gómez, E. Multiple f0 estimation in vocal ensembles using convolutional neural networks. In Proceedings of the International Society for Music Information Retrieval (ISMIR), Montréal, Canada, 2020.
  • [3] Gardner, J.P.; Simon, I.; Manilow, E.; Hawthorne, C.; Engel, J. MT3: Multi-task multitrack music transcription. In Proceedings of the International Conference on Learning Representations (ICLR), 2022.
  • [4] Weiß, C.; Müller, M. From music scores to audio recordings: Deep pitch-class representations for measuring tonal structures. ACM Journal on Computing and Cultural Heritage 2024.