Published January 14, 2023 | Version v1
Working paper Open

Photong: Generating 16-Bar Melodies from Images

  • 1. United World College of South East Asia
  • 2. School of Information Sciences, University of Illinois Urbana-Champaign

Description

This work studies whether melodies can be generated from arbitrary images using deep neural networks. We propose a VAE-based pipeline that generates cohesive 16-bar MIDI melodies from images through emotion detection and modality transfer using feature embeddings. The pipeline is implemented with an image encoder, a MIDI VAE, and three bridging computer vision models. We evaluate the system by examining the musical features of four distinct outputs to see how well they capture the features of the input images.
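To make the data flow in the description concrete, the following is a minimal sketch of an image-to-melody pipeline of this general shape: an image is embedded, a bridge maps the embedding into a VAE-style latent space, and a decoder turns the latent vector into a 16-bar note sequence. It is not the authors' implementation. All names (`encode_image`, `bridge_to_latent`, `decode_melody`, `image_to_melody`), the dimensions, the sixteenth-note grid, the C major pitch set, and the random untrained weights are assumptions for illustration, and the emotion detection stage is omitted.

```python
import numpy as np

BARS = 16                      # melody length stated in the description
STEPS_PER_BAR = 16             # assumption: sixteenth-note grid
TOTAL_STEPS = BARS * STEPS_PER_BAR
EMBED_DIM = 128                # assumption: image embedding size
LATENT_DIM = 32                # assumption: VAE latent size
# Assumption: restrict output to one octave of C major (MIDI note numbers).
SCALE = np.array([60, 62, 64, 65, 67, 69, 71, 72])


def encode_image(image, seed=1):
    """Stand-in image encoder: fixed random projection of the pixels."""
    flat = image.reshape(-1).astype(np.float64) / 255.0
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((EMBED_DIM, flat.size))
    return np.tanh(proj @ flat / np.sqrt(flat.size))


def bridge_to_latent(embedding, rng, seed=2):
    """Hypothetical bridge: map the embedding to (mu, logvar), then sample
    a latent vector with the usual VAE reparameterization."""
    w = np.random.default_rng(seed).standard_normal(
        (2 * LATENT_DIM, EMBED_DIM)) / np.sqrt(EMBED_DIM)
    stats = w @ embedding
    mu, logvar = stats[:LATENT_DIM], stats[LATENT_DIM:]
    eps = rng.standard_normal(LATENT_DIM)
    return mu + np.exp(0.5 * logvar) * eps


def decode_melody(latent, seed=3):
    """Stand-in melody decoder: one pitch per time step via argmax."""
    w = np.random.default_rng(seed).standard_normal(
        (TOTAL_STEPS * len(SCALE), LATENT_DIM)) / np.sqrt(LATENT_DIM)
    logits = (w @ latent).reshape(TOTAL_STEPS, len(SCALE))
    return SCALE[np.argmax(logits, axis=1)]


def image_to_melody(image, seed=0):
    """Run the full sketch: image -> embedding -> latent -> pitches."""
    rng = np.random.default_rng(seed)  # controls latent sampling only
    embedding = encode_image(image)
    latent = bridge_to_latent(embedding, rng)
    return decode_melody(latent)


if __name__ == "__main__":
    demo_image = np.full((8, 8, 3), 128, dtype=np.uint8)  # toy gray image
    melody = image_to_melody(demo_image)
    print(melody.shape)
```

In a trained system the random projections would be replaced by learned networks, and the returned pitch array would be written out as a MIDI file; the sketch only shows how the stages connect.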

Files

photong_paper.pdf

Files (6.3 MB)

md5:c319159f3a93fb001f22b1d623293774

Additional details

Related works

Is published in
Working paper: https://openreview.net/forum?id=UQY0bqcl_mX (URL)