Latent Evolutionary Signatures: A General Framework for Analyzing Music and Cultural Evolution

Gerstein, Mark; Warrell, Jonathan; Salichos, Leonidas; Gancz, Michael

doi:10.5281/zenodo.10642075

Published January 25, 2024 | Version v1

Software | Open

Jonathan Warrell^a,b,1, Leonidas Salichos^a,b,e,1, Michael Gancz^c,1, Mark B. Gerstein^a,b,d

^1 Equal contribution.

^a Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA.

^b Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.

^c Department of Music, Yale University, New Haven, CT 06520, USA.

^d Department of Computer Science, Yale University, New Haven, CT 06520, USA.

^e Department of Biological and Chemical Sciences, New York Institute of Technology, New York, NY 10023, USA.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This repo contains our optimal harmony-based and form+harmony-based models for analyzing popular music as an evolutionary structure. The model architecture is based on a traditional variational autoencoder (VAE), but with an energy-based prior that penalizes a measure of ‘evolutionary distance’ in the latent space; here, that distance is informed by the temporal distance between song release dates (see schematic below). The key output of each model is a set of latent ‘evolutionary signatures’: characteristic distributions of chord/form k-mers that can be used to predict the date and genre of each song. We use the McGill Billboard corpus of popular song annotations as our dataset.

The subdirectory ‘harmonic_formal’ contains the code for training the model based on both chord progressions and formal features. The subdirectory ‘km4’ contains the code for training the model based on chord progressions of length 4.
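
To make the idea of chord progressions of length 4 concrete, here is a minimal sketch of how a song's chord sequence could be turned into a distribution over length-4 chord k-mers. It is written in Python purely for illustration and is not taken from the repository, whose code is MATLAB. The function name `chord_kmer_distribution` and the Roman-numeral chord labels are assumptions made for this example; the repo's actual chord encoding may differ.

```python
from collections import Counter


def chord_kmer_distribution(chords, k=4):
    """Return the relative frequency of each length-k chord k-mer in one song.

    `chords` is an ordered list of chord labels. Overlapping windows of
    length k are counted and normalized so the frequencies sum to 1.
    (Hypothetical helper for illustration, not the repo's implementation.)
    """
    kmers = [tuple(chords[i:i + k]) for i in range(len(chords) - k + 1)]
    counts = Counter(kmers)
    total = sum(counts.values())
    if total == 0:
        # The song is shorter than k chords, so it has no k-mers.
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


# Hypothetical chord sequence, chosen only to show the windowing.
song = ["I", "V", "vi", "IV", "I", "V", "vi", "IV", "I"]
dist = chord_kmer_distribution(song, k=4)
for kmer, freq in sorted(dist.items(), key=lambda item: -item[1]):
    print(" ".join(kmer), round(freq, 3))
```

A nine-chord sequence yields six overlapping windows, so repeated progressions show up as k-mers with higher relative frequency.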

In addition to the optimal models, we include some other configurations that we tested. These include variants on the coarse-graining of formal units (models suffixed with ‘_A’, ‘_B’, and ‘_None’). The suffixes refer to projection matrices A and B: A retains the formal categories that comprise the first 99% of the data and collapses all others into a category of ‘other’, while B bins together semantically similar formal units. We also include different means of normalizing the formal feature vector ‘X_struct’, which in its raw form contains the counts of each formal category (models suffixed with ‘_binarized’ or ‘_zscore’ use these normalizations). These configurations are contained within the ‘models’ folder of each subdirectory. Additional items of interest, including code for processing the McGill Billboard dataset, are included in the ‘supplemental’ folder. The ‘McGill-Billboard’ folder contains all raw data.
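
The coarse-graining and normalization variants described above can be sketched as follows. This is an illustrative Python sketch, not the repository's MATLAB code, and it rests on stated assumptions: that ‘the first 99% of the data’ means the most frequent formal categories whose cumulative count reaches 99% of all labels, and that z-scoring is applied to the values of one formal feature at a time. All function names and section labels here are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def coarse_grain_labels(labels, coverage=0.99):
    """Keep the most frequent labels up to `coverage`; map the rest to 'other'.

    Assumed reading of projection A: categories are ranked by frequency and
    retained until their cumulative share of all labels reaches `coverage`.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    kept, covered = set(), 0
    for label, n in counts.most_common():
        if covered >= coverage * total:
            break
        kept.add(label)
        covered += n
    return [label if label in kept else "other" for label in labels]


def binarize(values):
    """Replace raw counts with presence/absence indicators."""
    return [1 if v > 0 else 0 for v in values]


def zscore(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; return zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical formal-unit labels pooled over a small corpus.
labels = ["verse"] * 120 + ["chorus"] * 79 + ["fadeout"] * 1
coarse = coarse_grain_labels(labels)
print(Counter(coarse))

# Hypothetical raw counts of one formal category across four songs.
print(binarize([0, 3, 1, 0]))
print(zscore([1, 2, 3]))
```

Projection B, which bins semantically similar formal units, would need a hand-specified mapping between categories and is not sketched here.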

All code is currently written for MATLAB.

Files

Files (37.2 MB)

Name	Size	MD5
musevo.zip	37.2 MB	c1a8e9db6022fdfb8f111fc9b646f6b0

Additional details

Repository URL: https://github.com/gersteinlab/Musevo/tree/main
Programming language: MATLAB

	All versions	This version
Views	137	137
Downloads	31	31
Data volume	1.2 GB	1.2 GB
