Investigating U-Nets with various intermediate blocks for spectrogram-based singing voice separation

Woosung Choi; Minseok Kim; Jaehwa Chung; Daewon Lee; Soonyoung Jung

doi:10.5281/zenodo.4245404

Published October 11, 2020 | Version v1

Conference paper | Open access

Singing Voice Separation (SVS) aims to separate the singing voice from a given mixed musical signal. Many U-Net-based models have recently been proposed for the SVS task, but no existing work evaluates and compares the various types of intermediate blocks that can be used in the U-Net architecture. In this paper, we introduce a variety of intermediate spectrogram transformation blocks. We implement U-Nets based on these blocks and train them on complex-valued spectrograms so that both magnitude and phase are considered. These networks are then compared on the signal-to-distortion ratio (SDR) metric. A U-Net that uses a particular block composed of convolutional and fully-connected layers achieves state-of-the-art SDR on the MUSDB singing voice separation task by a large margin of 0.9 dB. Our code and models are available online.
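The two ideas the abstract relies on, feeding a complex-valued spectrogram to a network and scoring the output by SDR, can be illustrated with a minimal sketch. It is not the authors' code. The function names `complex_to_channels` and `sdr_db` are hypothetical, the real/imaginary channel split is one common way to present complex spectrograms to a real-valued network and is assumed here, and the SDR formula is the plain energy-ratio definition rather than the exact evaluation procedure used for the reported MUSDB numbers.

```python
import math


def complex_to_channels(spec):
    """Split a complex spectrogram (rows of complex bins) into two real channels.

    Assumption for illustration: the network sees real and imaginary parts
    as separate channels, so phase information is kept alongside magnitude.
    """
    real = [[z.real for z in row] for row in spec]
    imag = [[z.imag for z in row] for row in spec]
    return [real, imag]


def sdr_db(reference, estimate, eps=1e-12):
    """Simplified signal-to-distortion ratio in dB for two equal-length signals.

    SDR = 10 * log10(||reference||^2 / ||reference - estimate||^2).
    eps guards against division by zero for a perfect estimate.
    """
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy example: the estimate is the reference scaled by 0.9.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
print(round(sdr_db(reference, estimate), 2))  # 20.0
```

In this toy case the error energy is one hundredth of the signal energy, which gives 20 dB. On this logarithmic scale, the 0.9 dB margin quoted above corresponds to roughly a 19% reduction in distortion energy relative to the signal, under the simplified definition used here.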

Files

Name	Size
46.pdf (md5:7a16d7947d2376d97268c0545ece243d)	503.0 kB

Resource type

Conference paper

Publisher

ISMIR

Imprint

Proceedings of the 21st International Society for Music Information Retrieval Conference, 192-198. Montreal, Canada.

Conference

International Society for Music Information Retrieval Conference (ISMIR 2020), Montreal, Canada, October 11-16, 2020

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: November 5, 2020
Modified: July 19, 2024
