Modelling long- and short-term structure in symbolic music with attention and recurrence
Description
The automatic composition of music with long-term structure is a central problem in music generation. Neural network-based models have been shown to perform relatively well at melody generation, but generating music with long-term structure remains a major challenge. This paper introduces a new approach to music modelling, the long short-term universal transformer (LSTUT), which combines recent advances in transformer models with recurrent networks, and compares its ability to predict music against current state-of-the-art music models. Our experiments are designed to push the boundaries of music models on considerably long music sequences, a crucial requirement for learning long-term structure effectively. Results show that the LSTUT outperforms all the other models and can potentially learn features related to music structure at different time scales. Overall, we show the importance of integrating both recurrence and attention in the architecture of music models, and their potential use in future automatic composition systems.
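As a rough illustration of the idea of combining recurrence and attention in one block (not the paper's actual LSTUT architecture, whose details are in the PDF below), the following minimal PyTorch sketch stacks an LSTM layer, intended to capture local, short-term structure, with a self-attention layer that relates distant positions in a long sequence. All layer sizes and the module name `RecurrentAttentionBlock` are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RecurrentAttentionBlock(nn.Module):
    """Illustrative sketch (not the paper's LSTUT): recurrence for short-term
    structure followed by self-attention for long-range dependencies."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x, attn_mask=None):
        # Recurrent pass models local note-to-note ordering; residual + norm keeps training stable.
        h, _ = self.lstm(x)
        x = self.norm1(x + h)
        # Self-attention lets every position attend to every other, capturing long-range structure.
        a, _ = self.attn(x, x, x, attn_mask=attn_mask, need_weights=False)
        return self.norm2(x + a)

# Example: a batch of 2 token-embedding sequences of length 1024 with 256 features.
block = RecurrentAttentionBlock(d_model=256)
out = block(torch.randn(2, 1024, 256))
print(out.shape)  # torch.Size([2, 1024, 256])
```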
Files

| Name | Size |
|---|---|
| CSMC__MuMe_2020_paper_46.pdf (md5:6a131a176465004be9c89276ed45dc04) | 2.0 MB |