Published October 19, 2020 | Version v1
Conference paper (Open Access)

Automatic Music Transcription and Instrument Transposition with Differentiable Rendering

Description

Automatic music transcription aims to extract a musical score from a given audio signal. Conventional machine-learning frameworks usually address this task by relying solely on error back-propagation from annotated MIDI data, without considering acoustic similarity. In this study, we complement the onsets-and-frames prediction objective with an acoustic distance, obtained through differentiable rendering of the estimated piano-roll and approximate reconstruction of the analyzed signal. We apply our method to piano transcription and show that this added reconstruction error improves the performance achieved with the usual supervised transcription loss. Moreover, training solely on this acoustic criterion allows fully unsupervised learning, with results that outperform classical techniques. Finally, our method also enables automatic instrument transposition: the input signal can be reconstructed using audio samples of an instrument different from the original sound source.
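The combined objective described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the differentiable renderer is a simple linear model that maps piano-roll activations to a magnitude spectrogram via per-note spectral templates, and all function and parameter names (`render_pianoroll`, `note_templates`, `alpha`) are hypothetical.

```python
import numpy as np

def render_pianoroll(roll, note_templates):
    """Differentiably render a piano-roll to an approximate spectrogram.

    roll: (frames, pitches) note activations in [0, 1].
    note_templates: (pitches, freq_bins) magnitude spectrum per note
        (assumed to come from recorded instrument samples).
    Returns a (frames, freq_bins) approximate magnitude spectrogram.
    """
    return roll @ note_templates

def reconstruction_loss(est_roll, target_spec, note_templates):
    """Acoustic distance between the rendered and the analyzed signal."""
    est_spec = render_pianoroll(est_roll, note_templates)
    return np.mean((est_spec - target_spec) ** 2)

def transcription_loss(est_roll, gt_roll, eps=1e-7):
    """Supervised binary cross-entropy against an annotated piano-roll."""
    p = np.clip(est_roll, eps, 1 - eps)
    return -np.mean(gt_roll * np.log(p) + (1 - gt_roll) * np.log(1 - p))

def total_loss(est_roll, gt_roll, target_spec, note_templates, alpha=1.0):
    """Supervised loss plus the acoustic reconstruction term.

    Dropping the first term (or setting gt_roll aside) corresponds to the
    fully unsupervised setting, where only the acoustic criterion is used.
    """
    return (transcription_loss(est_roll, gt_roll)
            + alpha * reconstruction_loss(est_roll, target_spec, note_templates))
```

Under this reading, instrument transposition amounts to swapping `note_templates` for spectra of a different instrument while keeping the estimated piano-roll fixed, so the reconstruction targets the new timbre.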

Files

CSMC__MuMe_2020_paper_31.pdf (1.8 MB)
md5:8c92436345ceda65805d65fd0cf08f00