Dataset Open Access

LatentJam Dataset

Philip Tovstogan; Xavier Serra; Dmitry Bogdanov

The LatentJam dataset contains 12 latent representations of music (9 content-based and 3 collaborative-filtering) for 29,275 music tracks from Jamendo. It is released as part of the publication "Similarity of Nearest-Neighbor Query Results in Deep Latent Spaces". For details on the individual models used for extraction, refer to the text of the paper.
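As a rough sanity check, the file sizes listed on this page can be turned into an estimate of each representation's dimensionality. The sketch below assumes that every .npy file holds one row per track, that values are stored as 32-bit floats, and that the listed sizes use 1 MB = 10^6 bytes; none of this is stated in the description, so treat the results as estimates and confirm them against the array shapes after loading.

```python
# Estimate the number of dimensions per track from the listed file sizes.
N_TRACKS = 29275        # number of tracks stated in the description
BYTES_PER_VALUE = 4     # assumption: float32 storage

# Sizes in MB as shown in the file listing (rounded to one decimal there).
listed_sizes_mb = {
    "cb-audioset-vggish-embeddings.npy": 15.0,
    "cb-msd-musicnn-embeddings.npy": 23.4,
    "cb-msd-musicnn-taggrams.npy": 5.9,
    "cb-msd-vgg-embeddings.npy": 30.0,
    "cf-64.npy": 7.5,
    "cf-96.npy": 11.2,
    "cf-128.npy": 15.0,
}


def estimated_dim(size_mb):
    """Return the nearest whole number of values per track for a file size."""
    return round(size_mb * 1_000_000 / (N_TRACKS * BYTES_PER_VALUE))


estimated_dims = {name: estimated_dim(mb) for name, mb in listed_sizes_mb.items()}

for name, dim in estimated_dims.items():
    print(f"{name}: about {dim} values per track")
```

Under these assumptions the estimates for the three collaborative-filtering files agree with the numbers in their file names (64, 96, 128), which suggests the assumptions are reasonable.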

Example code that uses this dataset to reproduce the experiments and analysis in the paper is available at https://github.com/philtgun/compare-embeddings. See the README there for more details on how to use this dataset and the individual files.
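For a quick look at the data without the full repository, a minimal sketch of a nearest-neighbor query over one latent space might look like the following. It is not the code used in the paper: the helper names `load_space` and `nearest_neighbors` are hypothetical, cosine similarity is an illustrative choice of metric, and the row-to-id alignment with large-jamendo-ids.txt is an assumption that should be checked against the README.

```python
import numpy as np


def load_space(embeddings_path, ids_path):
    """Load one latent space and the track ids assumed to label its rows."""
    embeddings = np.load(embeddings_path)
    with open(ids_path) as fp:
        track_ids = [line.strip() for line in fp if line.strip()]
    # Assumption: one id per row, in the same order as the array rows.
    if len(track_ids) != embeddings.shape[0]:
        raise ValueError("number of ids does not match number of rows")
    return embeddings, track_ids


def nearest_neighbors(embeddings, query_index, k=5):
    """Return indices of the k rows most cosine-similar to the query row."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.maximum(norms, 1e-12)  # guard against zero vectors
    scores = unit @ unit[query_index]
    scores[query_index] = -np.inf  # exclude the query track itself
    return np.argsort(-scores)[:k]


# Synthetic stand-in so the sketch runs without the dataset files.
demo = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
demo_neighbors = nearest_neighbors(demo, query_index=0, k=2)
print(demo_neighbors)
```

With the real files, one would call something like `load_space("cf-64.npy", "large-jamendo-ids.txt")` and map the returned indices back to Jamendo ids through the loaded list.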

The content-based representations were extracted with the Essentia library (essentia-tensorflow 2.1b6.dev374) during the first author's internship at Jamendo in 2021, as part of the MIP-Frontiers project. The collaborative-filtering representations were computed from data provided by Jamendo with the Implicit library (AlternatingLeastSquares algorithm, implicit 0.4.4). The Python version used was 3.7.13.
This dataset is released under the CC BY-NC-SA 4.0 license.

Please cite the publication if you use this dataset:

@inproceedings{tovstogan_similarity_2022,
	title = {Similarity of nearest-neighbor query results in deep latent spaces},
	author = {Tovstogan, Philip and Serra, Xavier and Bogdanov, Dmitry},
	booktitle = {Proceedings of the 19th Sound and Music Computing Conference ({SMC})},
	year = {2022}
}

 

Files (179.2 MB)

Name                                 Size       MD5
cb-audioset-vggish-embeddings.npy    15.0 MB    f0c1d26be40d7bab1e4ffd48a85bbf52
cb-msd-musicnn-embeddings.npy        23.4 MB    f55f3973e9f7e11c6660c270ecbce7bf
cb-msd-musicnn-taggrams.npy          5.9 MB     2028ad0778581f1c7dc3941f8a4be400
cb-msd-vgg-embeddings.npy            30.0 MB    b6ae6abaa7d62818e4a342cf407e13b7
cb-msd-vgg-taggrams.npy              5.9 MB     ce6d9306fc25a11b24f875148062cf8a
cb-mtat-musicnn-embeddings.npy       23.4 MB    b08f067fbea48d3c8eb89caacad1e4d4
cb-mtat-musicnn-taggrams.npy         5.9 MB     c36186769f1cfa364203c7d78079c606
cb-mtat-vgg-embeddings.npy           30.0 MB    29c64ade28c4ce4c2a77274d0a0c166b
cb-mtat-vgg-taggrams.npy             5.9 MB     cb82032d39a84986a781ed62b47a2107
cf-128.npy                           15.0 MB    d70d4dfe103f93b631f98ab272ab1605
cf-64.npy                            7.5 MB     69b7c5925bb46ada5b25e767c9a70fd5
cf-96.npy                            11.2 MB    f0712038ca73c7cd5bf0a1f1412f75a2
large-jamendo-ids.txt                226.9 kB   691b44e0154fc759c743112c83f26d44
small-indices.txt                    7.0 kB     cc188229131b7fb32067cc3dfb7ffd53
small-jamendo-ids.txt                10.1 kB    5d6c58fc672458174a706146bca365f3
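
The MD5 checksums listed for each file can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify` are hypothetical, and the file is assumed to sit in the current working directory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 matches the listed checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify("cf-64.npy", "69b7c5925bb46ada5b25e767c9a70fd5")` should return True for an uncorrupted copy of that file.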