There is a newer version of this record available.

Conference paper Open Access

Audio Metaphor 2.0: An Improved Classification and Segmentation Pipeline for Generative Sound Design Systems

Kranabetter, Joshua; Carpenter, Craig; Tchemeube, Renaud Bougueng; Pasquier, Philippe; Thorogood, Miles

Soundscape composition and design is the creative practice of processing and combining sound recordings to evoke auditory associations and memories in a listener. We present a new set of classification and segmentation algorithms for Audio Metaphor (AUME), a generative system for creating novel soundscape compositions. Audio Metaphor processes natural language queries from a user to retrieve semantically linked sound recordings from a database of 395,541 audio files. Building on previous work, we implemented a new audio feature extractor and conducted experiments to test the accuracy of the updated system. We then classified audio files according to general soundscape composition categories, improved emotion prediction, and refined our segmentation algorithm. The model maintains good accuracy in segment classification, and the valence and arousal prediction models improve significantly, reaching r-squared values of 72.2% and 92.0% and mean squared errors of 0.09 and 0.03 for valence and arousal, respectively. An empirical analysis finds that, among other improvements, the new system provides better segmentation results.
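
For readers unfamiliar with the two metrics quoted in the abstract, the sketch below shows how r-squared and mean squared error are conventionally computed for a regression model such as a valence or arousal predictor. It is a generic illustration under stated assumptions, not the authors' evaluation code: the function names and the sample ratings are hypothetical and do not come from the paper.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between ratings and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical listener ratings and model predictions (NOT data from the paper).
rated = [0.2, -0.5, 0.8, 0.1]
predicted = [0.1, -0.4, 0.7, 0.2]

print("MSE:", mean_squared_error(rated, predicted))
print("R^2:", r_squared(rated, predicted))
```

Under these definitions, a lower mean squared error and an r-squared closer to 1 (100%) both indicate predictions that track the rated values more closely.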


