
Published October 11, 2020 | Version v1
Conference paper | Open Access

Modeling perception with hierarchical prediction: Auditory segmentation with deep predictive coding locates candidate evoked potentials in EEG

Description

The human response to music combines low-level expectations driven by the perceptual characteristics of the audio with high-level expectations arising from the context and the listener's expertise. This paper discusses surprisal-based music representation learning with a hierarchical predictive neural network. To inspect the cognitive validity of the network's predictions along their time scales, we use the network's prediction error to segment electroencephalograms (EEG) based on the audio signal. Using the NMED-T dataset of passive natural music listening, we explore the automatic segmentation of audio and EEG into events with the proposed model. By averaging the EEG signal only at the predicted event locations, we were able to visualize auditory evoked potentials connected to local and global musical structures. This indicates the potential of unsupervised predictive learning with deep neural networks as a means to retrieve musical structure from audio and as a basis for uncovering the corresponding cognitive processes in the human brain.
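The abstract describes the pipeline only at a high level; the following Python sketch illustrates the general idea under stated assumptions, not the authors' actual implementation. It treats peaks in a predictive model's prediction-error curve as candidate event boundaries and averages EEG epochs time-locked to those boundaries, as in an evoked-potential analysis. The function names (detect_events, average_evoked), the peak-picking threshold, the sampling rate, and the synthetic stand-in data are all hypothetical choices for illustration.

```python
import numpy as np
from scipy.signal import find_peaks

def detect_events(prediction_error, fs, min_separation_s=0.5):
    """Pick peaks in a model's prediction-error curve as candidate
    event boundaries (illustrative threshold: mean + 1 std)."""
    height = prediction_error.mean() + prediction_error.std()
    peaks, _ = find_peaks(prediction_error, height=height,
                          distance=int(min_separation_s * fs))
    return peaks  # sample indices of candidate events

def average_evoked(eeg, events, fs, pre_s=0.1, post_s=0.5):
    """Average EEG epochs time-locked to the detected events to expose
    a candidate evoked potential; returns (channels, epoch_samples)."""
    pre, post = int(pre_s * fs), int(post_s * fs)
    epochs = [eeg[:, e - pre:e + post] for e in events
              if e - pre >= 0 and e + post <= eeg.shape[1]]
    return np.mean(epochs, axis=0)

# Demo with synthetic signals as stand-ins for the network's prediction
# error and a multichannel EEG recording (rate and channel count assumed).
fs = 125                                        # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
n_samples = 60 * fs                             # one minute of signal
prediction_error = np.abs(rng.standard_normal(n_samples))
eeg = rng.standard_normal((32, n_samples))      # 32 channels of noise
events = detect_events(prediction_error, fs)
erp = average_evoked(eeg, events, fs)
print(f"{len(events)} candidate events, ERP shape {erp.shape}")
```

With real data, the noise in eeg averages out across epochs, so any deflection that is consistently time-locked to the predicted boundaries survives the mean, which is the logic behind locating candidate evoked potentials at the model's high-surprisal points.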

Files (2.1 MB)

219.pdf (2.1 MB)
md5:ace3a07d8c3ee04833270c1129c51a54