Published June 1, 2013
| Version v1
Conference paper
Open Access
Cross-modal Sound Mapping Using Deep Learning
Creators
Description
We present a method for automatic feature extraction and cross-modal mapping using deep learning. Our system uses stacked autoencoders to learn a layered feature representation of the data. Feature vectors from two (or more) different domains are mapped to each other, effectively creating a cross-modal mapping. Our system can either run fully unsupervised, or it can use high-level labeling to fine-tune the mapping according to a user's needs. We show several applications for our method, mapping sound to or from images or gestures. We evaluate system performance both in standalone inference tasks and in cross-modal mappings.
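The abstract describes an architecture in which each modality gets its own stacked autoencoder and a mapping ties their latent codes together. The following is a minimal sketch of that idea, not the authors' implementation: all layer sizes, module names, and the joint training loss are illustrative assumptions used only to show how per-modality autoencoders and a latent-space mapping could fit together.

```python
# Minimal sketch (illustrative assumptions, not the paper's code) of stacked
# autoencoders per modality plus a mapping between their latent spaces.
import torch
import torch.nn as nn

class StackedAutoencoder(nn.Module):
    """Two-layer ("stacked") autoencoder for a single modality."""
    def __init__(self, input_dim, hidden_dim, code_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, code_dim), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
        )

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code

class CrossModalMap(nn.Module):
    """Maps latent codes of one modality (e.g. sound) to another (e.g. image)."""
    def __init__(self, code_dim_a, code_dim_b):
        super().__init__()
        self.map = nn.Linear(code_dim_a, code_dim_b)

    def forward(self, code_a):
        return self.map(code_a)

def train_step(ae_sound, ae_image, mapper, sound_batch, image_batch, opt):
    # Each autoencoder minimises reconstruction error on its own modality
    # (unsupervised); the mapper is trained on paired examples so that the
    # sound code lands near the corresponding image code.
    recon_s, code_s = ae_sound(sound_batch)
    recon_i, code_i = ae_image(image_batch)
    mapped = mapper(code_s)
    loss = (nn.functional.mse_loss(recon_s, sound_batch)
            + nn.functional.mse_loss(recon_i, image_batch)
            + nn.functional.mse_loss(mapped, code_i.detach()))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In this sketch, the supervised fine-tuning mentioned in the abstract would correspond to adjusting which sound/image pairs are fed to the mapping loss; the abstract does not specify the actual loss or pairing scheme, so those details here are assumptions.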
Files

Name | Size | Checksum
---|---|---
nime2013_111.pdf | 546.8 kB | md5:d67fc5b70d2ce375f525da84c5a4a37a