Published November 13, 2023 | Version v1
Conference paper Open

DiffVel: Note-Level MIDI Velocity Estimation for Piano Performance by A Double Conditioned Diffusion Model


In any piano performance, expressiveness is paramount for effectively conveying the intent of the performer, and one of the most significant aspects of expressiveness is the loudness at the individual key or note level. However, accurately detecting note-level loudness poses a considerable technical challenge due to the polyphonic nature of piano performances, wherein multiple notes are played simultaneously, as well as the similarity of harmonic elements. MIDI velocity is crucial for indicating loudness in piano notes. This study conducted experiments on estimating note-level MIDI velocity using DiffRoll, a diffusion model for piano transcription. Our findings show that, with double conditioning on audio and score information and with noise removal as a post-processing step, the model has potential for estimating MIDI velocity.


