Dataset Open Access

Synth-Salience Choral Set

Helena Cuesta; Emilia Gómez

The Synth-salience Choral Set (SSCS) is a publicly available dataset created to support research on voice assignment based on pitch salience.

By definition, an "ideal" pitch salience representation of a music recording is zero wherever there is no perceptible pitch, and has a positive value that reflects the pitches' perceived energy at the frequency bins of the corresponding F0 values. In practice, for a normalized synthetic pitch salience function we assume a value equal to the maximum energy (salience), i.e., 1, in the time-frequency bins that correspond to the notes present in a song, and 0 elsewhere. We obtain such a synthetic pitch salience representation directly by processing the digital (MusicXML, MIDI) score of a music piece, using the desired time and frequency quantization, i.e., a time-frequency grid.

To build the SSCS, we collect scores of four-part (SATB) a cappella choral music from the Choral Public Domain Library (CPDL) using their API. We assemble a collection of 5381 scores in MusicXML format, which we subsequently convert into MIDI files for easier parsing.

Each song in the dataset comprises five CSV files: one with the polyphonic pitch salience representation of the four voices (*_mix.csv) and four additional files with the monophonic pitch salience representation of each voice separately (*_S/A/T/B.csv). In both cases, the asterisk stands for the name of the song, which is shared by all representations of the same song.

Besides the pitch salience files, we provide a metadata CSV file (sscs_metadata.csv) which indicates the associated CPDL URL for each song in the dataset. Note that this dataset contains the input/output features used in the cited study, i.e., salience functions, and not audio files or scores. However, the accompanying metadata file allows researchers to access the associated open access scores for each example in the dataset.

When using this dataset for your research, please cite:

Helena Cuesta and Emilia Gómez (2022). Voice Assignment in Vocal Quartets using Deep Learning Models based on Pitch Salience. Transactions of the International Society for Music Information Retrieval (TISMIR). To appear.

Helena Cuesta (2022). Data-driven Pitch Content Description of Choral Singing Recordings. PhD thesis. Universitat Pompeu Fabra, Barcelona.
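The synthetic salience construction described above can be illustrated with a minimal sketch. It is not the authors' code: the grid parameters (lowest frequency, bins per semitone, number of bins, hop size), the note-tuple format and all function names are hypothetical, and combining the voices with an element-wise maximum is an assumption about how a polyphonic representation could be derived from the monophonic ones.

```python
import math

# All grid parameters are illustrative assumptions, not the SSCS settings.
F_MIN_HZ = 32.7        # frequency of the lowest bin (assumed)
BINS_PER_SEMITONE = 5  # frequency resolution (assumed)
N_BINS = 360           # 6 octaves * 12 semitones * 5 bins (assumed)
HOP_SECONDS = 0.01     # time resolution (assumed)


def midi_to_hz(midi_pitch):
    """Convert a MIDI note number to Hz (note 69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_pitch - 69) / 12.0)


def hz_to_bin(freq_hz):
    """Map a frequency to the nearest bin of a log-spaced frequency axis."""
    return round(12 * BINS_PER_SEMITONE * math.log2(freq_hz / F_MIN_HZ))


def synth_salience(notes, n_frames):
    """Build a binary salience grid (N_BINS rows x n_frames columns).

    notes: list of (onset_seconds, offset_seconds, midi_pitch) tuples,
    a hypothetical format standing in for notes parsed from a score.
    """
    grid = [[0] * n_frames for _ in range(N_BINS)]
    for onset, offset, pitch in notes:
        b = hz_to_bin(midi_to_hz(pitch))
        if not 0 <= b < N_BINS:
            continue  # note lies outside the assumed frequency range
        start = max(0, round(onset / HOP_SECONDS))
        stop = min(n_frames, round(offset / HOP_SECONDS))
        for t in range(start, stop):
            grid[b][t] = 1  # maximum salience where a note is active
    return grid


def mix_salience(voice_grids):
    """Combine per-voice grids by element-wise maximum (assumed rule)."""
    return [[max(cells) for cells in zip(*rows)] for rows in zip(*voice_grids)]


# Two toy voices: A4 held for 0.5 s and A2 held for 1.0 s.
soprano = synth_salience([(0.0, 0.5, 69)], n_frames=100)
bass = synth_salience([(0.0, 1.0, 45)], n_frames=100)
mix = mix_salience([soprano, bass])
```

Each grid is 1 in the bins covered by an active note and 0 elsewhere, matching the normalized definition in the description; the actual time and frequency quantization of the dataset files should be taken from the cited publications.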
Name: Synth-Salience Choral Set
Type: Dataset
Creators: Helena Cuesta (Universitat Pompeu Fabra); Emilia Gómez (Joint Research Centre)
Date published: 2022-05-10
Version: 1.0.0
Distribution: data download, zip

                  All versions   This version
Views             75             75
Downloads         1              1
Data volume       2.3 GB         2.3 GB
Unique views      62             62
Unique downloads  1              1

