Dataset Open Access

Synth-Salience Choral Set

Helena Cuesta; Emilia Gómez

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="005">20220514015013.0</controlfield>
  <controlfield tag="001">6534429</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Joint Research Centre</subfield>
    <subfield code="a">Emilia Gómez</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2278328229</subfield>
    <subfield code="z">md5:6b83cef701b3c4703af741b55618569a</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2022-05-10</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="4">
    <subfield code="p">Transactions of the International Society for Music Information Retrieval (TISMIR)</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Universitat Pompeu Fabra</subfield>
    <subfield code="0">(orcid)0000-0001-8531-4487</subfield>
    <subfield code="a">Helena Cuesta</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Synth-Salience Choral Set</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;The &lt;strong&gt;Synth-salience Choral Set&lt;/strong&gt; (SSCS) is a publicly available dataset for voice assignment based on pitch salience.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;The dataset was created to support research on voice assignment based on pitch salience. By definition, an &amp;ldquo;ideal&amp;rdquo; pitch salience representation of a music recording is zero wherever there is no perceptible pitch, and has a positive value reflecting the pitches&amp;rsquo; perceived energy at the frequency bins of the corresponding F0 values. In practice, for a normalized synthetic pitch salience function we assume the maximum salience value, i.e., 1, in the time-frequency bins that correspond to the notes present in a song, and 0 elsewhere. We obtain such a synthetic pitch salience representation directly by processing the digital score (MusicXML, MIDI) of a music piece, using the desired time and frequency quantization, i.e., a time-frequency grid.&lt;/p&gt;
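The grid construction described above can be sketched as follows. This is a minimal illustration, not the dataset's generation code: the grid parameters (11 ms hop, 360 log-spaced bins starting at 32.7 Hz, 5 bins per semitone) are assumptions for the example, not the SSCS specification.

```python
import numpy as np

def synthetic_salience(notes, n_frames, n_bins=360, hop=0.011, fmin=32.7,
                       bins_per_semitone=5):
    """Binary time-frequency salience grid from (onset_s, offset_s, f0_hz) notes."""
    S = np.zeros((n_bins, n_frames))
    for onset, offset, f0 in notes:
        # map the note frequency to a log-spaced frequency bin
        b = int(round(12 * bins_per_semitone * np.log2(f0 / fmin)))
        if 0 <= b < n_bins:
            t0 = int(onset / hop)
            t1 = min(int(np.ceil(offset / hop)), n_frames)
            S[b, t0:t1] = 1.0  # maximum salience where the note sounds, 0 elsewhere
    return S

# a single A4 (440 Hz) sounding from 0.0 s to 0.5 s
grid = synthetic_salience([(0.0, 0.5, 440.0)], n_frames=100)
```

A full score would contribute one such note list per voice, rasterized onto the same grid.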

&lt;p&gt;To build the SSCS, we collect scores of four-part (SATB) a cappella choral music from the &lt;a href=""&gt;Choral Public Domain Library (CPDL)&lt;/a&gt; using their API. We assemble a collection of &lt;strong&gt;5381 scores&lt;/strong&gt; in MusicXML format, which we subsequently convert into MIDI files for easier parsing.&lt;/p&gt;

&lt;p&gt;Each song in the dataset comprises five CSV files: one with the polyphonic pitch salience representation of the four voices (*_mix.csv) and four additional files with the monophonic pitch salience representation of each voice separately (*_S/A/T/B.csv). In both cases, the asterisk refers to the name of the song, which is shared between all representations of the same song.&lt;br&gt;
Besides the pitch salience files, we provide a metadata CSV file (sscs_metadata.csv) which indicates the associated CPDL URL for each song in the dataset. Note that this dataset contains the input/output features used in the cited study, i.e., salience functions, and neither audio files nor scores. However, the accompanying metadata file allows researchers to access the associated open-access scores for each example in the dataset.&lt;/p&gt;
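The per-song file layout can be resolved programmatically. The helper below is an illustrative sketch: only the *_mix / *_S/A/T/B naming pattern comes from the description; the root directory and song name are made up.

```python
from pathlib import Path

def song_files(root, name):
    """Paths of the five salience CSVs for one song (per-part naming as described)."""
    return {part: Path(root) / f"{name}_{part}.csv"
            for part in ("mix", "S", "A", "T", "B")}

files = song_files("SSCS", "ave_maria")  # hypothetical root and song name
```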

&lt;p&gt;When using this dataset for your research, please cite:&lt;/p&gt;

&lt;p&gt;Helena Cuesta and Emilia G&amp;oacute;mez (2022).&amp;nbsp;&lt;strong&gt;Voice Assignment in Vocal Quartets using Deep Learning Models based on Pitch Salience&lt;/strong&gt;. Transactions of the International Society for Music Information Retrieval (TISMIR).&amp;nbsp;&lt;em&gt;To appear.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Helena Cuesta (2022). &lt;strong&gt;Data-driven Pitch Content Description of Choral Singing Recordings&lt;/strong&gt;. PhD thesis. Universitat Pompeu Fabra, Barcelona.&lt;/p&gt;</subfield>
  </datafield>

  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.6534428</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.6534429</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
                  All versions   This version
Views                       75             75
Downloads                    1              1
Data volume             2.3 GB         2.3 GB
Unique views                62             62
Unique downloads             1              1

