Clustering Expressive Timing in Performed Classical Piano Music with VQ-VAE
Authors/Creators
Description
Many attempts to cluster expressive timing in performed classical piano music suffer from the variable lengths of phrases. This work uses VQ-VAE, a deep-learning-based method, to cluster expressive timing. The proposed method uses a codebook with a codec structure, where each code vector corresponds to a cluster. Code vectors that are very similar can be further merged, which gives a more flexible way to determine the number of clusters for expressive timing. To evaluate the proposed method, a model selection test with Gaussian Mixture Models (GMMs) for expressive timing is repeated to compare the optimal number of clusters. The Jensen-Shannon (JS) divergence between the clusters produced by VQ-VAE and by the GMMs is also measured to show the difference between the cluster distributions. The results show that the number of clusters produced by VQ-VAE is supported by the model selection test with GMMs, and that the difference between the expressive timing cluster distributions of VQ-VAE and the GMMs is acceptable.
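The codebook idea described above can be illustrated with a small NumPy sketch. It is not the authors' implementation. The Euclidean distance, the greedy merging rule, the `threshold` value, the toy data, and all function names are assumptions made for illustration. The sketch shows three steps: assigning each latent vector to its nearest code vector, merging code vectors that lie close together, and computing a discrete JS divergence between two cluster histograms.

```python
import numpy as np

def assign_codes(latents, codebook):
    """Index of the nearest code vector (Euclidean) for each latent vector."""
    dists = np.linalg.norm(latents[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def merge_similar_codes(codebook, threshold):
    """Greedy merge (an assumed rule): map each code to the first earlier,
    still-unmerged code closer than `threshold`, then relabel consecutively."""
    k = len(codebook)
    mapping = np.arange(k)
    for i in range(k):
        for j in range(i):
            is_representative = mapping[j] == j
            if is_representative and np.linalg.norm(codebook[i] - codebook[j]) < threshold:
                mapping[i] = j
                break
    _, relabeled = np.unique(mapping, return_inverse=True)
    return relabeled

def js_divergence(p, q):
    """Discrete Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy data (hypothetical): three 2-D code vectors, two of them nearly identical.
codebook = np.array([[0.0, 0.0], [0.05, 0.0], [1.0, 1.0]])
latents = np.array([[0.01, 0.0], [0.9, 1.1], [0.06, 0.01]])

codes = assign_codes(latents, codebook)                      # one code per latent
code_to_cluster = merge_similar_codes(codebook, threshold=0.1)
clusters = code_to_cluster[codes]                            # clusters after merging
```

In this toy case the two nearby code vectors collapse into one cluster, so the number of clusters falls from three to two without retraining. That is the kind of flexibility the description attributes to merging code vectors.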
Files
| Name | Size |
|---|---|
| CMMR2025_P3_14.pdf (md5:2b2b3796fe63cd4cd7886f8b84ecfa5c) | 2.1 MB |