Published June 7, 2022 | Version v1
Conference paper | Open Access

Deep HRTF Encoding & Interpolation: Exploring Spatial Correlations using Convolutional Neural Networks

  • 1. UC San Diego

Description

With the advancement of deep learning technologies, computers today achieve remarkable success in several domains involving images and audio. One promising area for deep learning in 3D audio is binaural sound localization for headphones, which requires individualized, accurate representations of the filtering effects introduced by the listener's anthropometry. Such filters are often stored as a set of Head Related Impulse Responses (HRIRs), or in their frequency-domain representation, Head Related Transfer Functions (HRTFs), for specific individuals. A challenge in applying deep learning in this area is the scarcity of large, complete, and accurate HRTF datasets, which is known to cause networks to over-fit easily to the training data. Whereas in images the correlations between pixels are largely statistical, the spatial correlations among HRTFs are expected to arise chiefly from body and pinna reflections. We hypothesize that these spatial correlations between the elements of an HRTF set can be learned using Deep Convolutional Neural Networks (DCNNs). In this work, we first present a CNN-based auto-encoding strategy for HRTF encoding, and then use the learned auto-encoder to provide an alternative solution for interpolating HRTFs from a sparse spatial distribution of HRTFs. We conclude that DCNNs are capable of achieving results comparable to non-deep-learning-based approaches, despite using only a few tens of data points.
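As the abstract notes, HRTFs are the frequency-domain counterparts of HRIRs. A minimal sketch of that relationship, using a synthetic impulse response (a pure delay, not data from any real HRTF dataset) and NumPy's real FFT:

```python
import numpy as np

def hrir_to_hrtf(hrir: np.ndarray, n_fft: int = 256) -> np.ndarray:
    """Return the complex one-sided HRTF spectrum of a time-domain HRIR.

    Hypothetical helper for illustration; real HRTF datasets store one
    such filter per ear and per direction around the listener.
    """
    return np.fft.rfft(hrir, n=n_fft)

# Synthetic HRIR: a unit impulse delayed by 8 samples (a pure delay).
hrir = np.zeros(64)
hrir[8] = 1.0

hrtf = hrir_to_hrtf(hrir)

# A pure delay affects only the phase of the transfer function;
# its magnitude spectrum stays flat at 1 across all frequency bins.
assert np.allclose(np.abs(hrtf), 1.0)
```

A full HRTF set stacks these spectra over a grid of source directions; it is the correlations across that spatial grid that the paper's convolutional auto-encoder is trained to capture.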

Files

45.pdf (632.0 kB)
md5:b38144fbc13441d303d9dd3b4246ee03