Published September 11, 2023 | Version v1
Video/Audio Open

Perceptually-motivated spatial audio codec for higher-order Ambisonics compression - Examples

  • 1. Aalto University
  • 2. Tampere University

Description

Scene-based spatial audio formats, such as Ambisonics, are playback-system agnostic and may therefore be favoured for delivering immersive audio experiences to a wide range of (potentially unknown) devices. The number of channels required to deliver Ambisonic audio at high spatial resolution, however, can be prohibitive for low-bandwidth applications. Therefore, in this paper, a compression codec is proposed that is based upon the higher-order Directional Audio Coding (HO-DirAC) model. The encoder downmixes the higher-order Ambisonics (HOA) input audio into a reduced number of signals, which are accompanied by spatial parameterization metadata. The downmixed audio is coded using a perceptual audio coder, whereas the metadata is grouped into perceptual bands, quantised, and downsampled. On the decoder side, low Ambisonic orders are fully recovered, whereas high Ambisonic orders that are not fully recoverable are synthesized based on the spatial metadata. The results of a listening test indicate that the proposed parametric spatial audio codec can improve the performance of the adopted perceptual coder, especially at low to medium-high bitrates, when applied to fifth-order HOA signals.
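To make the channel-count concern concrete: a full (3D) Ambisonic signal of order N carries (N + 1)^2 channels, so the fifth-order signals used in the listening test have 36 channels. The sketch below is illustrative only and is not part of the proposed codec. The function names are hypothetical, and the 48 kHz sampling rate and 16-bit depth are assumed values chosen to show the scale of an uncompressed stream, not figures taken from the paper.

```python
def hoa_channel_count(order: int) -> int:
    """Number of channels in a full 3D Ambisonic signal of the given order."""
    if order < 0:
        raise ValueError("Ambisonic order must be non-negative")
    return (order + 1) ** 2


def raw_pcm_bitrate_kbps(order: int,
                         sample_rate_hz: int = 48_000,  # assumed, not from the paper
                         bit_depth: int = 16) -> float:  # assumed, not from the paper
    """Uncompressed PCM bitrate in kbit/s for an HOA signal of the given order."""
    return hoa_channel_count(order) * sample_rate_hz * bit_depth / 1000


if __name__ == "__main__":
    # Show how quickly the channel count and raw bitrate grow with order.
    for order in range(1, 6):
        channels = hoa_channel_count(order)
        kbps = raw_pcm_bitrate_kbps(order)
        print(f"order {order}: {channels:2d} channels, {kbps:8.0f} kbit/s raw PCM")
```

Under these assumptions the quadratic growth in channel count is what makes a downmix to fewer transport signals, plus compact spatial metadata, attractive at low bitrates.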

Files (75.4 MB)

Name: listening_test_items.zip
Size: 75.4 MB
Checksum: md5:caae5e245fa791da93fd2b3c3ae8f5cf
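After downloading the archive, the md5 checksum listed for it can be used to confirm that the file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names are hypothetical, and the local path `listening_test_items.zip` assumes the archive was saved to the current working directory under its original name.

```python
import hashlib
import io
import os

# Checksum as listed in this record for listening_test_items.zip.
EXPECTED_MD5 = "caae5e245fa791da93fd2b3c3ae8f5cf"


def md5_of_stream(stream, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a binary file-like object, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path: str) -> str:
    """Return the hex MD5 digest of the file at the given path."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


if __name__ == "__main__":
    # Small in-memory demonstration of the helper.
    print(md5_of_stream(io.BytesIO(b"hello world")))

    # Assumed local path; adjust to wherever the archive was saved.
    archive = "listening_test_items.zip"
    if os.path.exists(archive):
        actual = md5_of_file(archive)
        print("checksum OK" if actual == EXPECTED_MD5 else f"mismatch: {actual}")
    else:
        print(f"{archive} not found; download it first")
```

Reading in fixed-size chunks keeps memory use constant regardless of archive size.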