A Fine Tuning Strategy to Improve Musical Source Separation Quality for Indian Carnatic Music
Authors/Creators
Contributors
Supervisor (3):
- 1. Universitat Pompeu Fabra
Description
The computational analysis of Carnatic music from audio remains a field of high research interest due to the genre’s rich melodic and rhythmic complexity. However,
despite the availability of large multitrack collections such as Saraga, the live-recorded nature of this repertoire leads to a scarcity of truly clean instrument and
vocal stems, posing significant challenges for both musicological and technological studies. State-of-the-art music source separation (MSS) models perform poorly on
Carnatic music due to a pronounced domain mismatch with their training data.
This work proposes a fine-tuning strategy for improving separation of vocals, mridangam, and violin plus tanpura stems in Carnatic music. The approach uses a
Sparse Compression Network (SCNet) pretrained on MusDB18, extended with a curated training set combining clean Carnatic multitrack recordings and out-of-domain
data. To further reduce the domain gap, three data augmentations are introduced: (i) violin sampling augmentation, (ii) microphone-bleeding simulation, and (iii) room impulse response convolution.
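As a rough illustration of augmentations (ii) and (iii), the sketch below shows one simple way bleed and reverberation can be simulated on clean stems. It is not the thesis implementation: the function names (`simulate_bleed`, `synthetic_rir`, `apply_rir`), the fixed bleed gain, and the exponentially decaying noise used as a stand-in impulse response are all assumptions made for this example. Violin sampling augmentation is not covered, since the description does not specify how it works.

```python
import numpy as np


def simulate_bleed(stems, bleed_gain_db=-20.0):
    """Add an attenuated sum of all *other* stems to each stem.

    `stems` maps a (hypothetical) stem name to a 1-D array of equal length.
    The single fixed gain is an assumption; a real setup would likely vary it.
    """
    gain = 10.0 ** (bleed_gain_db / 20.0)  # dB -> linear amplitude
    total = sum(stems.values())            # sample-wise mix of all stems
    return {name: x + gain * (total - x) for name, x in stems.items()}


def synthetic_rir(sr=44100, rt60=0.4, seed=0):
    """Toy room impulse response: white noise with a 60 dB decay over `rt60` s.

    Stands in for a measured impulse response, which this sketch does not have.
    """
    rng = np.random.default_rng(seed)
    n = int(sr * rt60)
    t = np.arange(n) / sr
    decay = np.exp(-np.log(1000.0) * t / rt60)  # amplitude falls by 60 dB at rt60
    rir = rng.standard_normal(n) * decay
    rir[0] = 1.0                                 # direct-path impulse
    return rir / np.max(np.abs(rir))


def apply_rir(signal, rir):
    """Convolve a signal with an impulse response, trimmed to the input length."""
    return np.convolve(signal, rir, mode="full")[: len(signal)]


if __name__ == "__main__":
    sr = 8000
    t = np.arange(sr) / sr
    clean = {
        "vocals": np.sin(2 * np.pi * 220 * t),
        "mridangam": np.sin(2 * np.pi * 90 * t),
        "violin_tanpura": np.sin(2 * np.pi * 440 * t),
    }
    bled = simulate_bleed(clean, bleed_gain_db=-18.0)
    rir = synthetic_rir(sr=sr, rt60=0.3)
    augmented = {name: apply_rir(x, rir) for name, x in bled.items()}
    print({name: x.shape for name, x in augmented.items()})
```

In this sketch bleed is applied before reverberation, so every stem carries a faint copy of the others in the same simulated room; whether the thesis uses this ordering is not stated in the description.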
On a clean Carnatic benchmark derived from the Sanidha dataset, the best configuration outperforms all baselines by a large margin in signal-to-distortion ratio (SDR),
and a perceptual evaluation on Saraga confirms significant quality gains on all three separated sources. Training takes under two days on a single 40 GB GPU,
making the approach considerably less resource-intensive than many comparable deep-learning-based MSS domain adaptation methods.
All pretrained models, code, a cleaned version of the Saraga dataset, and the Sanidha benchmark are released alongside this work.
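For readers unfamiliar with the SDR metric reported above, the following minimal sketch computes the plain signal-to-distortion ratio, 10·log10(‖s‖² / ‖s − ŝ‖²), for a reference stem s and an estimate ŝ. The description does not say which SDR variant or toolkit the thesis uses, so this basic definition, the function name `sdr`, and the `eps` stabiliser are assumptions for illustration only.

```python
import numpy as np


def sdr(reference, estimate, eps=1e-8):
    """Plain signal-to-distortion ratio in dB (higher is better).

    `eps` avoids division by zero for silent references or perfect estimates.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    error_power = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10((signal_power + eps) / (error_power + eps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(44100)
    noisy_estimate = clean + 0.1 * rng.standard_normal(44100)
    print(f"SDR: {sdr(clean, noisy_estimate):.2f} dB")
```

Under this definition, an estimate whose error amplitude is one tenth of the reference amplitude scores about 20 dB, and each further tenfold reduction in error adds another 20 dB.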
Files
| Name | Size | Checksum |
|---|---|---|
| Serafin-Schweinitz_SMS_2025_Master_Thesis.pdf | 17.7 MB | md5:8896ef679f28ba9ca54ec9d989ca923f |
Additional details
Dates
- Accepted: 2025-10-09