
Published October 19, 2020 | Version v1
Conference paper | Open Access

Modelling long- and short-term structure in symbolic music with attention and recurrence

Description

The automatic composition of music with long-term structure is a central problem in music generation. Neural-network-based models have been shown to perform relatively well in melody generation, but generating music with long-term structure remains a major challenge. This paper introduces a new approach to music modelling that combines recent advances in transformer models with recurrent networks, the Long-Short Term Universal Transformer (LSTUT), and compares its ability to predict music against current state-of-the-art music models. Our experiments are designed to push the boundaries of music models on considerably long music sequences, a crucial requirement for learning long-term structure effectively. Results show that the LSTUT outperforms all the other models and can potentially learn features related to musical structure at different time scales. Overall, we show the importance of integrating both recurrence and attention in the architecture of music models, and their potential use in future automatic composition systems.
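To make the architectural idea concrete, below is a minimal sketch of the general pattern the abstract describes: a recurrent layer for short-term structure followed by a weight-shared self-attention block applied repeatedly, in the spirit of the Universal Transformer. This is not the authors' implementation; the class name, layer sizes, layer ordering, and number of attention steps are all assumptions made for illustration.

```python
# Hedged sketch of an LSTM + weight-shared-attention hybrid for symbolic
# music prediction. All hyperparameters and the exact layer ordering are
# assumptions, not details taken from the paper.

import torch
import torch.nn as nn


class LSTUTSketch(nn.Module):
    def __init__(self, vocab_size=128, d_model=256, n_heads=4, n_att_steps=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Recurrence: intended to capture local, short-term structure.
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        # Attention: a single encoder layer whose weights are reused across
        # several passes, as in the Universal Transformer.
        self.att = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.n_att_steps = n_att_steps
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) of integer symbol ids
        x = self.embed(tokens)
        x, _ = self.lstm(x)  # short-term, recurrent pass
        # Causal mask so each position only attends to earlier positions,
        # as required for next-symbol prediction.
        seq_len = x.size(1)
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=x.device),
            diagonal=1,
        )
        for _ in range(self.n_att_steps):  # long-term, attention passes
            x = self.att(x, src_mask=mask)  # same weights at every step
        return self.out(x)  # next-symbol logits at each position


# Usage: next-symbol logits for a toy batch of long token sequences.
model = LSTUTSketch()
logits = model(torch.randint(0, 128, (2, 512)))  # shape (2, 512, 128)
```

The design choice the paper argues for is visible in the data flow: the recurrent pass handles fine-grained local dependencies cheaply, while the repeated shared attention passes give every position a view of the whole (long) sequence.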

Files (2.0 MB)

CSMC__MuMe_2020_paper_46.pdf (2.0 MB)
md5:6a131a176465004be9c89276ed45dc04