848650
doi
10.5281/zenodo.848650
oai:zenodo.org:848650
user-multimodality
user-mir
user-computer-vision
user-mdm-dtic-upf
Arias, Pablo
ENS Cachan, Université Paris Saclay
Haro, Gloria
Universitat Pompeu Fabra
Gomez, Emilia
Universitat Pompeu Fabra
Visual music transcription of clarinet video recordings trained with audio-based labelled data
Zinemanas, Pablo
Universidad de la República
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
automatic music transcription
computer vision
deep learning
music information retrieval
multimodality
Department of Information and Communication Technologies, UPF, Barcelona
<p>Automatic transcription is a well-known task in the music information retrieval (MIR) domain and consists of computing a symbolic music representation (e.g. MIDI) from an audio recording. In this work, we address the automatic transcription of video recordings when the audio modality is missing or of insufficient quality, and therefore analyze the visual information. We focus on the clarinet, which is played by opening and closing a set of holes and keys. We propose a method for automatic visual note estimation that detects the player's fingertips and measures their displacement with respect to the holes and keys of the clarinet. To this end, we track the clarinet and determine its position in every frame. The relative positions of the fingertips are used as features for a machine learning algorithm trained for note pitch classification. For that purpose, a dataset is built semi-automatically by estimating pitch information from the audio signals in an existing collection of 4.5 hours of video recordings of six different songs performed by nine different players. Our results confirm the difficulty of visual automatic transcription compared with audio-based transcription, mainly due to motion blur and occlusions that cannot be resolved with a single view.</p>
Zenodo
2017-10-23
info:eu-repo/semantics/conferencePaper
848649
user-multimodality
user-mir
user-computer-vision
user-mdm-dtic-upf
1579538612.981879
1553217
md5:41c07f69ed46f44d6faac8ad165b6fa6
https://zenodo.org/records/848650/files/PID4967623.pdf
public
10.5281/zenodo.848649
isVersionOf
doi