Published December 9, 2024 | Version 1.3.1
Dataset · Open

Cadenza Challenge (CAD2): databases for rebalancing classical music task

  • University of Salford
  • University of Sheffield
  • University of Nottingham
  • University of Leeds

Description


Please cite CadenzaWoodwind as:

Gerardo Roa-Dabike, Trevor J. Cox, Alex J. Miller, Bruno M. Fazenda, Simone Graetzer, Rebecca R. Vos, Michael A. Akeroyd, Jennifer Firth, William M. Whitmer, Scott Bannister, Alinka Greasley, Jon P. Barker, "The Cadenza Woodwind Dataset: Synthesised Quartets for Music Information Retrieval and Machine Learning," Data in Brief (2024), doi: https://doi.org/10.1016/j.dib.2024.111199

This is the training and validation data for the rebalancing classical music task from the Second Cadenza Machine Learning Challenge (CAD2).

The Cadenza Challenges are improving music production and processing for people with hearing loss. According to the World Health Organization, 430 million people worldwide have disabling hearing loss. Hearing aid users report several issues when listening to music, including distortion in the bass, difficulty in perceiving the full range of the music (especially high-frequency pitches), and a tendency to miss the impact of the quieter parts of compositions [1]. In a pilot study, we found that giving listeners sliders to rebalance the different instruments in a classical music ensemble was desirable.

Overview of files:

  1. CadenzaWoodwind. A synthesised dataset of small woodwind ensembles for training and validation.
  2. EnsembleSet_Mix_1. A subset of the synthesised EnsembleSet [7] for training and validation (Mix_1 render).
  3. Stereo_Reverb_Real_Data_For_Tuning.zip. A small set of real recordings for tuning.
  4. metadata.zip. Audiograms, scene details, target gains and compressor settings.

The audio files are in FLAC format inside the .zip archives; the JSON files contain the metadata.
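As an illustration, a stem and a metadata file could be loaded with standard Python tools. This is a minimal sketch: the paths are hypothetical placeholders, and the soundfile package is just one of several libraries that read FLAC.

```python
import json

import soundfile as sf  # pip install soundfile

# Hypothetical paths: substitute real locations after unzipping the archives.
audio_path = "CadenzaWoodwind/some_piece/mixture.flac"
meta_path = "metadata/gains.json"

# soundfile reads FLAC natively, returning the samples and the sample rate.
audio, fs = sf.read(audio_path)
print(f"{len(audio) / fs:.1f} s at {fs} Hz, shape {audio.shape}")

with open(meta_path) as f:
    gains = json.load(f)
print(f"Loaded {len(gains)} gain entries")
```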

More details below.

 

Methods

1) CadenzaWoodwind

This is a dataset of five woodwind instruments (flute, clarinet, oboe, alto saxophone and bassoon) created for two quartet orchestrations: (a) flute, clarinet, oboe, and bassoon and (b) flute, alto saxophone, oboe and bassoon. The stems for each solo instrument are presented along with the two mixtures.

The scores came from the OpenScore String Quartet Corpus [6], from which 21 were selected at random. The selection was weighted by piece length, so that longer scores with multiple movements were more likely to be chosen. This was done because the subsequent rendering process involved manual steps, and many short scores would have made it take much longer. A maximum of two pieces per composer was allowed. Two of the scores would not render in the music notation software and so were excluded. The 19 selected scores are listed in CadenzaWoodwind_openscores_selected.txt.
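For illustration, a length-weighted draw of this kind might look like the following sketch. The record fields and weighting are assumptions, not the actual procedure; the definitive selection, including the two-per-composer cap, is recorded in CadenzaWoodwind_openscores_selected.txt.

```python
import random

# Hypothetical records: (title, composer, length in bars).
corpus = [
    ("string_quartet_x", "Haydn", 1200),
    ("string_quartet_y", "Mozart", 350),
    # ... one entry per score in the OpenScore String Quartet Corpus
]

def weighted_selection(corpus, n=21, max_per_composer=2, seed=0):
    """Draw n scores without replacement, weighted by length,
    keeping at most max_per_composer pieces from any one composer."""
    rng = random.Random(seed)
    pool, chosen, per_composer = list(corpus), [], {}
    while pool and len(chosen) < n:
        weights = [length for _, _, length in pool]  # longer = more likely
        pick = rng.choices(pool, weights=weights, k=1)[0]
        pool.remove(pick)  # sample without replacement
        title, composer, _ = pick
        if per_composer.get(composer, 0) < max_per_composer:
            per_composer[composer] = per_composer.get(composer, 0) + 1
            chosen.append(title)
    return chosen
```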

The scores were interpreted using the Dorico music notation software. The four string parts were allocated to flute, oboe, clarinet and bassoon, and a professional sample library was used to create the audio. Reverberation was added using a convolution reverb, with an impulse response of the Royal Tropical Institute, Amsterdam, from Avid's Space IR library.

2) EnsembleSet_Mix_1

A subset of the synthesised EnsembleSet [7], provided as the Mix_1 stereo version to reduce the download size. Please see the EnsembleSet page and paper for full details of the methods.

3) Real Data for Tuning

Stereo_Reverb_Real_Data_For_Tuning.zip is a small sample of real recordings intended to help entrants deal with the mismatch between the synthetic training data and the real recorded evaluation set. It was generated from two databases: Bach10 [2] and URMP [3].

Both databases contain mono recordings of isolated instruments made in anechoic conditions. We took these and created stereo versions set in small halls using convolution reverb, with ambisonic b-format impulse responses from the OpenAIR database [4].

Impulse responses from four small-to-medium-sized venues were chosen: the Arthur Sykes Rymer Auditorium, University of York; Central Hall, University of York; the Dixon Studio Theatre, University of York; and York Guildhall Council Chamber. Only impulse responses measured with a source-receiver distance greater than 5 m were included.

The following procedure was used to convert the b-format impulse responses to stereo, applied for each instrument (a Python sketch follows the list):

  • The b-format representation was rotated in azimuth to face the instrument being rendered; the instruments were spaced 10 degrees apart.
  • The b-format was converted to mid-side stereo using Eqn 1.7 from reference [5] (alpha = 0.5).
  • The mid-side stereo impulse responses were convolved with the anechoic mono music recordings.
  • The mix was created by summing the audio for all the instruments in the ensemble.
  • The mix was then normalised so that the peak absolute sample value was 1. The scaling factor used to do this was also applied to each solo instrument's audio track.
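The dataset itself was created with the MATLAB code in the zip noted below; the following Python sketch merely illustrates the steps above. The rotation sign convention and the mid-side formula (mid as an alpha-weighted blend of the omni W and forward X channels, side from the lateral Y channel) are assumptions standing in for Eqn 1.7 of [5].

```python
import numpy as np
from scipy.signal import fftconvolve

def rotate_bformat(w, x, y, azimuth_deg):
    """Rotate a first-order b-format IR in azimuth (sign convention assumed)."""
    t = np.deg2rad(azimuth_deg)
    return w, x * np.cos(t) + y * np.sin(t), y * np.cos(t) - x * np.sin(t)

def bformat_to_stereo_ir(w, x, y, alpha=0.5):
    """Assumed stand-in for Eqn 1.7 of [5]: mid blends W and X, side is Y."""
    mid = alpha * w + (1.0 - alpha) * x
    side = y
    return mid + side, mid - side  # left, right

def render_ensemble(stems, w, x, y, spacing_deg=10.0):
    """stems: equal-length mono anechoic recordings, one per instrument."""
    n = len(stems)
    angles = (np.arange(n) - (n - 1) / 2) * spacing_deg  # 10 degrees apart
    solos = []
    for stem, angle in zip(stems, angles):
        wr, xr, yr = rotate_bformat(w, x, y, angle)
        left, right = bformat_to_stereo_ir(wr, xr, yr)
        solos.append(np.stack([fftconvolve(stem, left),
                               fftconvolve(stem, right)], axis=-1))
    mix = np.sum(solos, axis=0)
    scale = 1.0 / np.max(np.abs(mix))  # peak-normalise the mix...
    return mix * scale, [s * scale for s in solos]  # ...same gain for solos
```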

The readme.txt files in each folder give the filename of the impulse response used.

code_to_create_urmp_bach10_stereo_with_reverb.zip contains the MATLAB code used to create the dataset.

Notes

Release Notes

Version V 1.3.1

This version replaces the package cadenza_cad2_task2_eval.v1_0.tar.gz with cadenza_cad2_task2_eval.v1_1.tar.gz, which contains minor corrections.

Version V 1.3.0

This update includes all files from V 1.2.0 plus the cadenza_cad2_task2_eval.v1_0.tar.gz package, which corresponds to the evaluation data to be processed by the challenge participants.

Version V 1.2.0

This update enhances the package Stereo_Reverb_Real_Data_For_Tuning.zip. It now includes the essential metadata required to run the recipe. The included metadata files are as follows:

  • listeners.valid.json -> Contains the same listeners as in the validation set.
  • compressor_params.valid.json -> Contains the same compressor parameters as in the validation set.
  • gains.json -> Provides the same list of gains as found in the metadata.zip file.
  • music_tracks.eval_sample.json -> A selection of 8 tracks from the URMP and BACH10 datasets (note that these tracks are not part of the evaluation set).
  • music.eval_sample.json -> Details of 38 15-second audio segments for processing.
  • scenes.eval_sample.json -> A set of 152 scenes, derived from the 38 samples, each with 4 different gains (a quick consistency check is sketched after this list).
  • scene_listeners.eval_sample.json -> Lists the listeners assigned to each scene.
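Once the archive is unzipped, the derivation of the scene list (38 segments x 4 gains = 152 scenes) can be verified. This is a minimal sketch; it assumes the JSON files are in the current directory and that each parses to a list or object whose length is the number of entries.

```python
import json

def count(path):
    """Number of top-level entries in a JSON list or object."""
    with open(path) as f:
        return len(json.load(f))

n_scenes = count("scenes.eval_sample.json")
n_music = count("music.eval_sample.json")
print(n_scenes, n_music)        # expected: 152 and 38
assert n_scenes == n_music * 4  # 4 gain settings per segment
```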

We hope this sample dataset will assist users in refining and improving their systems.

Version V 1.1.0

  1. This version corrects issues in the training metadata for the challenge, specifically for 4 tracks in the EnsembleSet subset.
    • concerto-in-d-minor: the track has 4 violins, but under the challenge rules mixtures can contain at most 2 voices of the same instrument, so Violin_3 and Violin_4 were excluded.
    • hallelujah: originally has 6 instruments, but the challenge rules limit mixtures to 5; 5 instruments were randomly selected, discarding the Cello.
    • Semiramide_riconosciuta: originally has 6 instruments, but the challenge rules limit mixtures to 5; 5 instruments were randomly selected, discarding the Viola.
    • AlTardarDellaVendetta: Violin_1 is now included, making 4 instruments in the mixture. Violin_1 was previously omitted because it is spelled VIolin_1, not Violin_1, in EnsembleSet.
  2. The mixture tracks in the EnsembleSet_Mix_1.zip package now reflect the correct instruments.

The rest of the files remain the same as in version V 1.0.0.

Files (12.9 GB)

Additional details

Funding

UK Research and Innovation
EnhanceMusic: Machine Learning Challenges to Revolutionise Music Listening for People with Hearing Loss EP/W019434/1

Dates

Created
2024-07-05

Software

Repository URL
https://github.com/claritychallenge/clarity
Programming language
Python
Development Status
Active

References

  • [1] A. Greasley, H. Crook, and R. Fulford, "Music listening and hearing aids: perspectives from audiologists and their patients," International Journal of Audiology, vol. 59, no. 9, pp. 694–706, 2020.
  • [2] Z. Duan and B. Pardo, "Soundprism: an online system for score-informed source separation of music audio," IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 6, pp. 1205–1215, 2011.
  • [3] B. Li*, X. Liu*, K. Dinesh, Z. Duan, and G. Sharma, "Creating a multi-track classical music performance dataset for multi-modal music analysis: challenges, insights, and applications," IEEE Transactions on Multimedia, 2018. (*equal contribution)
  • [4] D. T. Murphy and S. Shelley, "OpenAIR: an interactive auralization web resource and database," in Audio Engineering Society Convention 129, 2010.
  • [5] F. Zotter and M. Frank, Ambisonics: A Practical 3D Audio Theory for Recording, Studio Production, Sound Reinforcement, and Virtual Reality (p. 5), Springer Nature, 2019.
  • [6] M. Gotham, M. Redbond, B. Bower, and P. Jonas, "The 'OpenScore String Quartet' corpus," in Proceedings of the 10th International Conference on Digital Libraries for Musicology, pp. 49–57, 2023.
  • [7] S. Sarkar, E. Benetos, and M. Sandler, "EnsembleSet: a new high-quality synthesised dataset for chamber ensemble separation," 2022.