CANTO-JRP Dataset: Audio Pitch Extractions from the Josquin Research Project
Description
The CANTO-JRP Dataset is based on compositions from the Josquin Research Project that were available on Spotify at the time this dataset was created. Due to copyright restrictions, the recordings are not publicly available. However, the dataset includes multiple f0 estimations (from various models), symbolic encodings, and metadata. CANTO-JRP is part of the CANTOSTREAM project. On Spotify, we created the CANTO-JRP playlist that is fully aligned with this dataset.
Please find a detailed description of the dataset in the README.md.
This dataset is described in our article:
Visscher, M., & Wiering, F. (2025). Fuzzy Frequencies: Finding Tonal Structures in Audio Recordings of Renaissance Polyphony. Heritage, 8(5), 164. https://doi.org/10.3390/heritage8050164
If you use this dataset for your research, please cite this paper.
Technical info (English)
General
This README.md file provides an overview of the CANTO-JRP Dataset, which is based on compositions from the Josquin Research Project, limited to those available on Spotify at the time the dataset was created. Due to copyright restrictions, the recordings themselves are not publicly available. Instead, the dataset includes multiple f0 estimations (from various models), symbolic encodings, and metadata.
Folder Structure
Each set of multiple f0 extractions is stored in its own folder. Due to their large size, these folders are compressed into separate tar.gz files. To use the data with the accompanying code from GitHub, download and extract the relevant folders into FuzzyFrequencies/data/raw/CANTO-JRP/. If you are only interested in the Multif0 extractions, you can download just experiment_metadata.csv and multif0.tar.gz.
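The download-and-extract step above can be sketched with the Python standard library. This is an illustrative helper, not part of the FuzzyFrequencies code; the function name and the download directory are assumptions, and the target path is the one named above.

```python
import tarfile
from pathlib import Path

# Sketch: extract the downloaded tar.gz archives into the directory layout
# expected by the FuzzyFrequencies code. The download directory and the
# function name are hypothetical; each archive contains its own folder.
def extract_archives(download_dir: str,
                     target: str = "FuzzyFrequencies/data/raw/CANTO-JRP"):
    dest = Path(target)
    dest.mkdir(parents=True, exist_ok=True)
    for archive in Path(download_dir).glob("*.tar.gz"):
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(dest)  # e.g. multif0.tar.gz -> <target>/multif0/

# extract_archives("downloads")
```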
Folders and files in this dataset
- 195f
- 214c
- basicpitch
- MT3
- multif0
- symbolic
- experiment_metadata.csv
- README.md
Description of dataset items
This section provides an overview of the data in each folder, how the data should be used, and its purpose.
| Folder | File types | Source | Number of files | Format |
|---|---|---|---|---|
| 195f | Multipitch extractions, model 195f | Fuzzy Frequencies | 637 | CSV |
| 214c | Multipitch extractions, model 214c | Fuzzy Frequencies | 637 | CSV |
| basicpitch | Basicpitch extractions | Fuzzy Frequencies | 637 | CSV |
| MT3 | MT3 extractions | Fuzzy Frequencies | 172 | MIDI |
| multif0 | Multif0 extractions | Fuzzy Frequencies | 637 | CSV |
| symbolic | Symbolic encodings | Josquin Research Project (JRP) | 637 | MusicXML |
Note: The number of MT3 extractions is lower than for the other models. Due to the MT3 model's lower performance on our dataset and its high computational cost, we only processed audio files smaller than approximately 10 MB.
File Types
Metadata
The file experiment_metadata.csv contains information about each composition from the JRP that was available on Spotify at the time this dataset was created. This file serves both as a reference for users of the dataset and as a specification file for the GitHub code.
| Field | Format | Description |
|---|---|---|
| id | integer | row identifier |
| nr_playlist | string | position(s) in the playlist |
| composer | string | composer's surname |
| composition | string | name of the composition |
| voices | integer | number of voices |
| experiment | string | experiment name, needed for the code |
| performer | string | performer(s) of the recording |
| Album | string | album name of the recording |
| year_recording | integer | year of recording |
| audio_final | integer | MIDI tone of the lowest note of the final chord |
| symbolic_is_audio | string | extent to which recording and encoding are the same (yes, almost, no) |
| instrumentation | string | instrumentation: (v)ocal and/or (i)nstrumental |
| instrumentation_category | string | category of instrumentation (vocal, instrumental, mixed) |
| final_safe | string | extent to which the audio final is the same as the (transposed) encoded final (yes, no, pitch class profile) |
| not repeated | string | whether there is repetition of the encoding in the recording (yes, no) |
| repetitions | string | rough specification of the repetitions |
| comments | string | extra comments, mainly instruments used |
| symbolic | string | file name of the symbolic encoding |
| audio | string | file name of the recording |
| mf0 | string | file name of the Multif0 extraction |
| basicpitch | string | file name of the Basicpitch extraction |
| multipitch_214c | string | file name of the Multipitch extraction, model 214c |
| multipitch_195f | string | file name of the Multipitch extraction, model 195f |
| MT3 | string | file name of the MT3 extraction |
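As a minimal sketch of how the metadata can be used, the snippet below reads experiment_metadata.csv with the standard library and filters on two of the fields documented above. The function names are hypothetical, and the filter (purely vocal recordings whose audio final is safe) is just one example query.

```python
import csv

# Sketch: load experiment_metadata.csv and select, e.g., purely vocal
# recordings whose audio final matches the (transposed) encoded final.
# Function names are hypothetical, not part of the FuzzyFrequencies code.
def load_metadata(path: str):
    with open(path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))

def vocal_and_final_safe(rows):
    return [r for r in rows
            if r["instrumentation_category"] == "vocal"
            and r["final_safe"] == "yes"]
```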
Multipitch extraction
The Multipitch extractions include a column for each MIDI tone, with cell values representing the loudness of the pitch at a given timestamp.
| Field | Format | Description |
|---|---|---|
| [empty] | integer | timestamp index; frame rate = 43.06640625 Hz |
| 1 | float | loudness of MIDI tone 1 + 24 = 25 |
| .. | .. | .. |
| 71 | float | loudness of MIDI tone 71 + 24 = 85 |
Basicpitch extraction
The Basicpitch extractions include a row for each detected note and its corresponding loudness.
| Field | Format | Description |
|---|---|---|
| start_time_s | float | start time of the pitch in seconds |
| end_time_s | float | end time of the pitch in seconds |
| pitch_midi | integer | MIDI tone of the pitch |
| velocity | integer | MIDI equivalent of loudness |
| pitch_bend | integer | multiple columns of microtonal pitch deviations |
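Because the number of pitch_bend columns varies per note, a plain row-wise reader is the simplest way to consume these files. The sketch below is an assumption about how to group the fixed and variable columns, not code from the accompanying repository.

```python
import csv

# Sketch: read a Basicpitch note CSV. The first four columns are fixed
# (start, end, MIDI tone, velocity); any remaining columns hold the
# microtonal pitch-bend series for that note. Function name is hypothetical.
def read_basicpitch(path: str):
    notes = []
    with open(path, newline="") as fh:
        reader = csv.reader(fh)
        next(reader)  # skip header row
        for row in reader:
            notes.append({
                "start": float(row[0]),
                "end": float(row[1]),
                "midi": int(row[2]),
                "velocity": int(row[3]),
                "bends": [int(v) for v in row[4:] if v != ""],
            })
    return notes
```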
MT3 extraction
The MT3 extractions are provided in MIDI format (Musical Instrument Digital Interface). MIDI is an industry standard music technology protocol used to represent musical data and allow communication between musical devices. For more details, see the MIDI specifications.
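As a minimal illustration of the format, a Standard MIDI File starts with a 14-byte MThd header chunk holding the file format, track count, and time division. The sketch below parses that header with the standard library; it is a toy reader for the header only, not a full MIDI parser.

```python
import struct

# Sketch: read the MThd header chunk of a Standard MIDI File, such as an
# MT3 extraction. The 14-byte header holds the format (0, 1, or 2), the
# number of track chunks, and the time division (ticks per quarter note).
def read_midi_header(data: bytes):
    chunk_id, length, fmt, ntracks, division = struct.unpack(">4sIHHH", data[:14])
    assert chunk_id == b"MThd" and length == 6, "not a Standard MIDI File"
    return {"format": fmt, "ntracks": ntracks, "division": division}

# Example: a format-1 file with 5 tracks and 480 ticks per quarter note.
header = struct.pack(">4sIHHH", b"MThd", 6, 1, 5, 480)
print(read_midi_header(header))  # {'format': 1, 'ntracks': 5, 'division': 480}
```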
Multif0 extraction
The Multif0 extractions do not have meaningful headers: the first column contains the timestamps, and the subsequent columns are 'voice' columns without voice leading. By default, the leftmost voice column contains the lowest detected frequency.
| Field | Format | Description |
|---|---|---|
| 0.0 | float | timestamp in seconds; time sample rate = 86.1328125 Hz |
| [empty] | float | frequency (Hz) of the lowest voice at that timestamp; frequency resolution = 20 cents |
| .. | .. | .. |
| [empty] | float | frequency (Hz) of the highest voice at that timestamp; frequency resolution = 20 cents |
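Since the voice columns hold raw frequencies in Hz, comparing them with the symbolic encodings or with the other extractions requires converting to (fractional) MIDI numbers. The standard conversion, assuming A4 = 440 Hz, is shown below; the function name is our own.

```python
import math

# Sketch: convert a Multif0 frequency column value (Hz) to a fractional
# MIDI number, assuming equal temperament with A4 = 440 Hz.
def hz_to_midi(freq: float) -> float:
    return 69 + 12 * math.log2(freq / 440.0)

print(round(hz_to_midi(440.0)))      # 69 (A4)
print(round(hz_to_midi(261.63), 2))  # 60.0 (middle C)
```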
Symbolic encoding
The symbolic encodings are provided in MusicXML format. For an introduction to this format, please see the MusicXML tutorial.
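MusicXML is plain XML, so basic pitch information can be pulled out with the standard library alone. The snippet below is a toy example over a hand-written fragment, not a JRP encoding; real scores have parts, measures, durations, and accidentals that a full reader (e.g. music21) would handle.

```python
import xml.etree.ElementTree as ET

# Sketch: extract (step, octave) pairs from a MusicXML string with the
# standard library. SNIPPET is a toy fragment, not a JRP encoding.
SNIPPET = """<score-partwise><part id="P1"><measure number="1">
  <note><pitch><step>G</step><octave>4</octave></pitch></note>
  <note><pitch><step>D</step><octave>5</octave></pitch></note>
</measure></part></score-partwise>"""

def pitches(xml_text: str):
    root = ET.fromstring(xml_text)
    return [(p.findtext("step"), int(p.findtext("octave")))
            for p in root.iter("pitch")]

print(pitches(SNIPPET))  # [('G', 4), ('D', 5)]
```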
Codebook
In this section, we specify for each file type how the data was collected or created.
For 611 of the 902 works on the JRP website, usable recordings were found on Spotify; these are collected in the Spotify playlist.
The Basicpitch extractions are created by applying the model by Bittner et al. (2022) [1] to the set of audio recordings.
The Multipitch extractions are created by applying the model by Weiß and Müller (2024) [4] to the set of audio recordings, with model 214c and 195f.
The MT3 extractions are created by applying the Colab notebook provided by Gardner et al. (2022) [3] to the audio files smaller than ~110 MB.
The Multif0 extractions are created by applying the model by Cuesta et al. (2020) [2] to the audio files.
The symbolic encodings are downloaded from The Josquin Research Project.
The files experiment_metadata.csv and README.md have been handcrafted by the first author.
Files (12.3 GB in total)

| Size | MD5 checksum |
|---|---|
| 6.0 GB | md5:dace9c1d94767925494b9ea321a85165 |
| 6.1 GB | md5:3ca66b89fa879baef7bfd138e1b5c669 |
| 25.1 MB | md5:eb81b22e46cb9251f09fb59d6abba4f7 |
| 285.0 kB | md5:b7664d40e60e2379c47c2fc59b4726b2 |
| 622.1 kB | md5:8cf29e4b6774f177af9ecb45057d0728 |
| 142.6 MB | md5:83682900e7a70fbe743b6f5df2fe9469 |
| 12.9 kB | md5:c9d3fa91f73ceb6286c55baa3a9ef2ac |
| 9.8 MB | md5:9209fcb0bd8cb1f52b77c4a369ee8cea |
Additional details
Software
- Repository URL
- https://github.com/MirjamVisscher/FuzzyFrequencies
References
1. Bittner, R.M.; Bosch, J.J.; Rubinstein, D.; Meseguer-Brocal, G.; Ewert, S. A lightweight instrument-agnostic model for polyphonic note transcription and multipitch estimation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Singapore, 2022.
2. Cuesta, H.; McFee, B.; Gómez, E. Multiple f0 estimation in vocal ensembles using convolutional neural networks. In Proceedings of the International Society for Music Information Retrieval (ISMIR), Montréal, Canada, 2020.
3. Gardner, J.P.; Simon, I.; Manilow, E.; Hawthorne, C.; Engel, J. MT3: Multi-task multitrack music transcription. In Proceedings of the International Conference on Learning Representations (ICLR), 2022.
4. Weiß, C.; Müller, M. From music scores to audio recordings: Deep pitch-class representations for measuring tonal structures. ACM Journal on Computing and Cultural Heritage 2024.