Building blocks for a complex-valued transformer architecture

Eilers, Florian; Jiang, Xiaoyi

doi:10.5281/zenodo.7660784

Published February 21, 2023 | Version v1

Conference paper (open access)

1. University of Münster

Most deep learning pipelines are built on real-valued operations to deal with real-valued inputs such as images, speech, or music signals. However, many applications naturally make use of complex-valued signals or images, such as MRI or remote sensing. Additionally, the Fourier transform of signals is complex-valued and has numerous applications. We aim to make deep learning directly applicable to these complex-valued signals without using projections into R^2. Thus, we add to the recent developments of complex-valued neural networks by presenting building blocks to transfer the transformer architecture to the complex domain. We present multiple versions of a complex-valued Scaled Dot-Product Attention mechanism as well as a complex-valued layer normalization. We test on a classification task and a sequence generation task on the MusicNet dataset and show improved robustness to overfitting while maintaining on-par performance when compared to the real-valued transformer architecture.
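
To make the two building blocks named in the abstract more concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method from the paper: the function names `complex_attention` and `complex_layer_norm` are hypothetical, the similarity score is assumed to be the real part of the Hermitian inner product between query and key, and the normalization is assumed to be a simple scalar-variance version. The paper describes multiple attention versions and its own layer normalization, which may differ from both choices.

```python
import math


def complex_attention(Q, K, V):
    """Scaled dot-product attention on complex inputs (illustrative variant).

    Q, K: lists of complex vectors of length d. V: list of complex vectors.
    ASSUMPTION: the similarity is Re(<q, k>) with <q, k> = sum(q_i * conj(k_i)).
    This is one possible choice, not necessarily one used in the paper.
    """
    d = len(Q[0])
    out = []
    for q in Q:
        # Real-valued scores, scaled by sqrt(d) as in the real-valued case.
        scores = [
            sum(qi * ki.conjugate() for qi, ki in zip(q, k)).real / math.sqrt(d)
            for k in K
        ]
        # Numerically stable softmax over the real scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Real weights are applied to the complex value vectors.
        out.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return out


def complex_layer_norm(z, eps=1e-5):
    """Normalize a complex vector to zero mean and (approximately) unit variance.

    ASSUMPTION: variance is the scalar E|z - mean|^2; correlations between
    real and imaginary parts are ignored in this simplified version.
    """
    mean = sum(z) / len(z)
    var = sum(abs(x - mean) ** 2 for x in z) / len(z)
    return [(x - mean) / math.sqrt(var + eps) for x in z]


if __name__ == "__main__":
    Q = [[1 + 1j, 0j]]
    K = [[1 + 1j, 0j], [-1 - 1j, 0j]]
    V = [[1j], [2 + 0j]]
    print(complex_attention(Q, K, V))
    print(complex_layer_norm([1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]))
```

Because the softmax in this sketch acts on real scores, the attention weights stay real and sum to one, so the output is a convex combination of complex value vectors and never leaves the complex domain or requires a projection into R^2.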

Files

ICASSP_2023.pdf (347.4 kB, md5:88e3dbda8c99b9f012846f60fe86d165)
