Insights into Transfer Learning between Image and Audio Music Transcription

doi:10.5281/zenodo.6573248

Published June 7, 2022 | Version v1

Conference paper Open

1. University of Alicante

Optical Music Recognition (OMR) and Automatic Music Transcription (AMT) are the research fields that devise methods to transcribe music sources (documents and audio signals, respectively) into a structured digital format. Historically, they have followed different approaches to achieve the same goal. However, their recent formulation as sequence labeling tasks brings them under a common framework. Under this premise, one may wonder whether there are synergies between the two fields that could be exploited to improve recognition rates in their respective domains. In this work, we further explore this question from a Transfer Learning (TL) point of view in the context of neural end-to-end recognition models. More precisely, we consider a music transcription system trained on either image or audio data and adapt it to the unseen domain during the training phase using different TL schemes. Results show that knowledge transfer slightly boosts model performance when sufficient data is available, but it is not properly leveraged when this condition is not met. This opens up a promising, yet challenging, research path towards building an effective bridge between two solutions to the same problem.

Files

38.pdf

Files (553.7 kB)

Name	Size
38.pdf md5:c8d071785dcbfde0655cb762f019970c	553.7 kB

