Deep Embeddings and Section Fusion Improve Music Segmentation

doi:10.5281/zenodo.5624371

Published November 7, 2021 | Version v1

Conference paper Open

Music segmentation algorithms identify the structure of a music recording by automatically dividing it into sections and determining which sections repeat and when. Since the desired granularity of the sections may vary by application, multi-level segmentation produces several levels of segmentation ordered by granularity from one section (the whole song) up to N unique sections, and has proven to be a challenging MIR task. In this work we propose a multi-level segmentation method that leverages deep audio embeddings learned via other tasks. Our approach builds on an existing multi-level segmentation algorithm, replacing manually engineered features with deep embeddings learned through audio classification problems where data are abundant. Additionally, we propose a novel section fusion algorithm that leverages the multi-level segmentation to consolidate short segments at each level in a way that is consistent with the segmentations at lower levels. Through a series of experiments we show that replacing handcrafted features with deep embeddings can lead to significant improvements in multi-level music segmentation performance, and that section fusion further improves the results by cleaning up spurious short sections. We compare our approach to two strong baselines and show that it yields state-of-the-art results.
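The abstract describes section fusion only at a high level, so the following Python sketch illustrates the general idea and is not the paper's algorithm. It makes several assumptions that are not in the source: levels are ordered from coarsest to finest, each level is a list of `(start, end, label)` tuples, a section counts as "short" when it falls below a duration threshold `min_dur`, and a short section may only be merged into a neighbor that lies inside the same section of the next coarser level. The names `parent_index`, `fuse_short_segments`, and `fuse_hierarchy` are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]  # (start time, end time, label)


def parent_index(coarser: List[Segment], t: float) -> int:
    """Index of the coarser-level segment that contains time t."""
    for idx, (start, end, _) in enumerate(coarser):
        if start <= t < end:
            return idx
    return len(coarser) - 1  # t is at or past the final boundary


def fuse_short_segments(level: List[Segment], coarser: List[Segment],
                        min_dur: float) -> List[Segment]:
    """Merge segments shorter than min_dur into a neighbor, without ever
    merging across a boundary of the coarser level (assumed rule)."""
    segs = list(level)
    changed = True
    while changed and len(segs) > 1:
        changed = False
        for i, (start, end, _) in enumerate(segs):
            if end - start >= min_dur:
                continue
            parent = parent_index(coarser, (start + end) / 2)
            # Keep only neighbors that share the same coarser-level parent.
            candidates = [
                j for j in (i - 1, i + 1)
                if 0 <= j < len(segs)
                and parent_index(coarser, (segs[j][0] + segs[j][1]) / 2) == parent
            ]
            if not candidates:
                continue  # no consistent neighbor, leave the segment alone
            # Absorb the short segment into the longer candidate neighbor.
            j = max(candidates, key=lambda k: segs[k][1] - segs[k][0])
            lo, hi = min(i, j), max(i, j)
            segs[lo:hi + 1] = [(segs[lo][0], segs[hi][1], segs[j][2])]
            changed = True
            break  # indices shifted, so rescan from the start
    return segs


def fuse_hierarchy(levels: List[List[Segment]], min_dur: float) -> List[List[Segment]]:
    """Fuse each level against the already-fused level above it."""
    fused = [list(levels[0])]  # coarsest level is kept unchanged
    for level in levels[1:]:
        fused.append(fuse_short_segments(level, fused[-1], min_dur))
    return fused


# Toy hierarchy: whole song, two sections, then a finer level with two
# spurious 2-second sections.
levels = [
    [(0.0, 60.0, "S")],
    [(0.0, 30.0, "A"), (30.0, 60.0, "B")],
    [(0.0, 14.0, "a"), (14.0, 16.0, "b"), (16.0, 30.0, "c"),
     (30.0, 32.0, "d"), (32.0, 60.0, "e")],
]
result = fuse_hierarchy(levels, min_dur=4.0)
```

In this toy example the short section at 30–32 s is merged forward into the section that follows it rather than backward, because merging backward would cross the 30 s boundary of the coarser level. Fusing each level against the already-fused level above it is one simple way to keep the hierarchy consistent; the actual procedure in the paper may differ.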

Files

000074.pdf (388.0 kB, md5:afcf19e3a4951c43be3f6c355fef6461)

Resource type

Conference paper

Publisher

ISMIR

Imprint

Proceedings of the 22nd International Society for Music Information Retrieval Conference, 594-601. Online.

Conference

International Society for Music Information Retrieval Conference (ISMIR 2021), Online, November 7-12, 2021

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: October 30, 2021
Modified: July 17, 2024
