Published November 3, 2025 | Version v1
Conference paper Open

Clustering Expressive Timing in Performed Classical Piano Music with VQ-VAE

Authors/Creators

Description

Many attempts to cluster expressive timing in performed classical piano music are hampered by the variable lengths of phrases. This work uses VQ-VAE, a deep-learning-based method, to cluster expressive timing. The proposed method uses a codebook with a codec structure, where each code vector corresponds to a cluster. Code vectors that are very similar can be further merged, which gives a more flexible way to determine the number of clusters for expressive timing. To evaluate the proposed method, a model selection test with Gaussian Mixture Models (GMMs) for expressive timing is repeated to compare the optimal number of clusters in expressive timing. The Jensen-Shannon (JS) divergence between the clusters produced by VQ-VAE and by GMMs is also measured to show the difference between the cluster distributions. The results show that the number of clusters produced by VQ-VAE is supported by the model selection test with GMMs. The difference in the distribution of expressive timing clusters between VQ-VAE and GMMs is acceptable.
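The description above names three operations: assigning each timing pattern to its nearest code vector, merging code vectors that are very similar, and comparing cluster distributions with JS divergence. The following is a minimal sketch of one possible reading of those steps, not the paper's implementation. The function names, the Euclidean distance, the merge threshold, the greedy merging rule, and the toy timing vectors are all assumptions made for illustration. No encoder or decoder is trained here.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_codes(vectors, codebook):
    """Quantisation step: index of the nearest code vector for each input."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        for v in vectors
    ]


def merge_similar_codes(codebook, threshold):
    """Give code vectors closer than `threshold` the same cluster label.

    Assumption: a greedy pairwise merge on Euclidean distance. The paper
    may define similarity and merging differently.
    """
    labels = list(range(len(codebook)))
    for i in range(len(codebook)):
        for j in range(i + 1, len(codebook)):
            if math.sqrt(squared_distance(codebook[i], codebook[j])) < threshold:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    # Renumber the surviving labels as 0, 1, 2, ...
    remap = {}
    return [remap.setdefault(lab, len(remap)) for lab in labels]


def cluster_histogram(labels, n_clusters):
    """Relative frequency of each cluster label."""
    counts = [0] * n_clusters
    for lab in labels:
        counts[lab] += 1
    return [c / len(labels) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy codebook: three hypothetical code vectors (two of them nearly identical).
codebook = [[1.0, 1.0], [1.05, 1.0], [0.8, 1.2]]
merged = merge_similar_codes(codebook, threshold=0.1)

# Toy timing patterns, quantised and then mapped to merged cluster labels.
patterns = [[1.0, 1.02], [1.04, 0.99], [0.82, 1.18], [0.79, 1.21]]
codes = assign_to_codes(patterns, codebook)
clusters = [merged[c] for c in codes]
vq_hist = cluster_histogram(clusters, n_clusters=max(merged) + 1)

# Hypothetical cluster proportions from another method (e.g. a GMM).
other_hist = [0.5, 0.5]
divergence = js_divergence(vq_hist, other_hist)
```

In this toy setting the first two code vectors fall under the threshold and share a label, so three codes yield two clusters. The threshold therefore controls the final cluster count, which is the flexibility the description refers to.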

Files

CMMR2025_P3_14.pdf (2.1 MB)
md5:2b2b3796fe63cd4cd7886f8b84ecfa5c