Thesis Open Access

SATB Voice Segregation For Monoaural Recordings

Pétermann, Darius A

Citation Style Language JSON Export

  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.4091247", 
  "author": [
      "family": "P\u00e9termann, Darius A,"
  "issued": {
    "date-parts": [
  "abstract": "Choral singing is a widely practiced form of ensemble singing wherein a group of people sing simultaneously in polyphonic harmony. The most commonly practiced setting for choir ensembles consists of four parts: Soprano, Alto, Tenor and Bass (SATB), each with its own range of fundamental frequencies (F0s). The task of source separation for this choral setting entails separating the SATB mixture into its constituent parts. Source separation for musical mixtures is well studied, and many deep learning-based methodologies have been proposed for it. However, most of the research has focused on the typical case of separating vocal, percussion and bass sources from a mixture, each of which has a distinct spectral structure. In contrast, the simultaneous and harmonic nature of ensemble singing leads to high structural similarity and overlap between the spectral components of the sources in a choral mixture, making source separation for choirs a harder task than the typical case. This, along with the lack of an appropriate consolidated dataset, has led to a dearth of research in the field so far. In this work, we first assess how well some of the recently developed methodologies for musical source separation perform for the case of SATB choirs. We then propose a novel domain-specific adaptation for conditioning the recently proposed U-Net architecture for musical source separation using the fundamental frequency contour of each of the singing groups, and demonstrate that our proposed approach surpasses results from domain-agnostic architectures. Lastly, we assess our approach using different evaluation methodologies, ranging from objective to subjective ones, and provide a comparative analysis of the various results.", 
  "title": "SATB Voice Segregation For Monoaural Recordings", 
  "type": "thesis", 
  "id": "4091247"
                  All versions  This version
Views             237           237
Downloads         169           169
Data volume       2.0 GB        2.0 GB
Unique views      203           203
Unique downloads  146           146

