Audio Metaphor 2.0: An Improved Classification and Segmentation Pipeline for Generative Sound Design Systems

doi:10.5281/zenodo.6798205

Published June 7, 2022 | Version v2

Conference paper Open

1. University of British Columbia
2. Simon Fraser University

Soundscape composition and design is the creative practice of processing and combining sound recordings to evoke auditory associations and memories in a listener. We present a new set of classification and segmentation algorithms for Audio Metaphor (AUME), a generative system for creating novel soundscape compositions. Audio Metaphor processes natural language queries from a user to retrieve semantically linked sound recordings from a database of 395,541 audio files. Building on previous work, we implemented a new audio feature extractor and conducted experiments to test the accuracy of the updated system. We then classified audio files according to general soundscape composition categories, improved emotion prediction, and refined our segmentation algorithm. The model maintains good accuracy in segment classification, and the valence and arousal prediction models improved significantly, as shown by their r-squared values (72.2% for valence and 92.0% for arousal) and mean squared errors (0.09 for valence and 0.03 for arousal). An empirical analysis finds that, among other improvements, the new system provides better segmentation results.
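For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how r-squared and mean squared error are conventionally computed for a regression model such as a valence or arousal predictor. It is not the authors' evaluation code: the function names and the sample ratings are hypothetical, and the textbook definitions are assumed.

```python
# Illustrative only: textbook definitions of the two metrics named in the
# abstract. The data below is made up and does not come from the paper.

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("r-squared is undefined when all targets are equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical valence ratings and model predictions.
    valence_true = [0.2, -0.5, 0.7, 0.1]
    valence_pred = [0.1, -0.4, 0.6, 0.3]
    print("valence MSE:", mean_squared_error(valence_true, valence_pred))
    print("valence R^2:", r_squared(valence_true, valence_pred))
```

An r-squared of 1 means the predictions explain all of the variance in the targets, while a value of 0 means the model does no better than always predicting the mean; a lower mean squared error indicates predictions closer to the targets.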

Files

51.pdf (730.4 kB), md5:52bd48e5e56dec80ae700a36e6bda44c

