QuartSet: A String Quartet Dataset for Transcription and Source Separation of Real Instrument Recordings

Rumbold, Erika; Tzanetakis, George

doi:10.5281/zenodo.17497288

Published November 3, 2025 | Version v1

Conference paper Open

With the state of the art for many MIR tasks being based on deep learning, it is crucial that there be a large amount of data to train these models. Many approaches opt for audio that has been synthesized from MIDI data, but these often lack the musicality and nuances of recorded musicians. We present QuartSet, a dataset of real-instrument recordings and their accompanying scores for the tasks of audio-to-score transcription and score-informed source separation. The scores of QuartSet are provided in Kern notation, which is a simple format that is easily manipulable. Instead of being simplified, as is the common practice for transcription works, the Kern scores contain every dynamic, articulation, and ornament marking as written by the original composers. This is so the scores fully reflect what is present in the audio recordings. We show that QuartSet can be easily integrated with MIDI-synthesized data to create a larger, more diverse dataset to train a transcription model. When trained on both synthesized and real audio data, the model was able to produce better transcriptions of other real audio than the model that was trained only on synthesized data. Finally, we suggest potential methods for creating more audio and score data without synthesis.

Files

Name	Size
CMMR2025_P1_14.pdf md5:638c239cd18f423b2d9750cf44b46125	1.2 MB

Resource type

Conference paper

Publisher

Zenodo

Imprint

Proceedings of the 17th International Symposium on Computer Music Multidisciplinary Research, 582-595. London, United Kingdom. ISBN: 979-10-97498-06-1.

Conference

17th International Symposium on Computer Music Multidisciplinary Research (CMMR 2025), London, United Kingdom, 3-7 November 2025

Languages

English

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: November 3, 2025
Modified: November 3, 2025
