Audio Metaphor 2.0: An Improved Classification and Segmentation Pipeline for Generative Sound Design Systems

doi:10.5281/zenodo.6573411

Published June 7, 2022 | Version v1

Conference paper Open

1. University of British Columbia
2. Simon Fraser University

Soundscape composition and design is the creative practice of processing and combining sound recordings to evoke auditory associations and memories within a listener. We present a new set of classification and segmentation algorithms as part of Audio Metaphor (AUME), a generative system for creating novel soundscape compositions. Audio Metaphor processes natural language queries from a user to retrieve semantically linked sound recordings from a database containing 395,541 audio files. Building on previous work, we implemented a new audio feature extractor and conducted experiments to test the accuracy of the updated system. We then classified audio files based on general soundscape composition categories, improved emotion prediction, and refined our segmentation algorithm. The model maintains good accuracy in segment classification, and we significantly improved the valence and arousal prediction models, as shown by the r-squared values (72.2% and 92.0%) and mean squared error values (0.09 and 0.03) for valence and arousal, respectively. Among other improvements, an empirical analysis finds that the new system provides better segmentation results.
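For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how r-squared and mean squared error are conventionally computed for a regression model. It is not the authors' evaluation code: the function names and the rating values are hypothetical, chosen only to illustrate the formulas.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("r-squared is undefined when all targets are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical ratings and model predictions (made-up numbers,
# not data from the paper).
ratings = [0.0, 1.0, 2.0, 3.0]
predictions = [0.0, 1.0, 2.0, 2.0]

mse = mean_squared_error(ratings, predictions)
r2 = r_squared(ratings, predictions)
print(f"MSE = {mse:.2f}, r-squared = {r2:.1%}")
```

A higher r-squared means the model explains more of the variance in the ratings, while a lower mean squared error means the predictions sit closer to the ratings on average. The two are reported together because r-squared is scale-free and mean squared error is expressed in the squared units of the rating scale.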

Files (692.1 kB)

Name	Checksum	Size
51.pdf	md5:e0c56b67f3fdf37a6bef1a55163d775f	692.1 kB

	All versions	This version
Views	194	137
Downloads	212	152
Data volume	156.6 MB	111.4 MB
