00000nam##2200000uu#4500 4091469 doi 10.5281/zenodo.4091469 oai:zenodo.org:4091469 user-smc-master user-mtgupf Jordà, Sergi Universitat Pompeu Fabra Haki, Behzad Universitat Pompeu Fabra Dualization Of Rhythm Patterns Kotowski, Błażej Universitat Pompeu Fabra info:eu-repo/semantics/openAccess Creative Commons Attribution 3.0 Unported https://creativecommons.org/licenses/by/3.0/legalcode cc-by-3.0 spdx rhythm dualization; dimensionality reduction; rhythm analysis; sequence to sequence; autoencoder; LSTM; latent model; <p>This dissertation summarizes research on the task of dualizing rhythm patterns. Rhythm pattern dualization transforms a multi-instrumental rhythm pattern into another pattern composed of at most two instruments while maintaining coherence and the perceptual essence of the original rhythm. Because the task is novel, we first conduct a comprehensive literature review spanning several disciplines. Drawing from neurology, cognitive science, and psychology, we assemble solid foundations for tackling the task. We propose two machine learning models built upon GrooVAE, a model for rhythm humanization recently reported by Google Magenta [34]. The GrooVAE network topology combines Sequence To Sequence Learning and Variational Autoencoder architectures. We treat dualization as a variation of the dimensionality reduction problem. Thus, we aim to obtain the dualized version of a rhythm by modifying the network’s architecture so that it creates a reduced intermediary representation in the process of autoencoding. The two proposed models achieve rhythm compression in different ways. In the first, the Autoencoders model, we first reduce the dimensionality of the original GrooVAE network and then collect hidden state vector values from the first layer of the decoding network. 
Then, we train a cluster of autoencoders to find a latent, two-dimensional representation of these hidden state vectors (h-vectors), which we treat as a dualized version of the input pattern. In the second, the Bottleneck model, we insert a two-dimensional bottleneck layer between the two original layers of the decoder network. We treat this two-dimensional bottleneck representation as a dualized version of the input pattern. Finally, we evaluate our models with listening experiments and report the results.</p> Zenodo 2020-09-15 Universitat Pompeu Fabra user-smc-master user-mtgupf info:eu-repo/semantics/doctoralThesis 20220825135537.0 2884436 md5:25a95f36f07804a62cbadff406304f4f https://zenodo.org/records/4091469/files/2020-Blazej-Kotowski.pdf open 10.5281/zenodo.4091468 isVersionOf doi