Synthesis of a spatially band-limited plane wave in the time-domain using wave field synthesis

Wave Field Synthesis (WFS) is a spatial sound reproduction technique aiming at a physically accurate reconstruction of a desired sound field within an extended listening area. It was shown in a recent study that the accuracy of the synthesized sound field can be improved in a local area by applying a spatial band-limitation to the driving function. However, the computational complexity of the frequency-domain driving function is demanding because of the involved Bessel functions. In this paper, a time-domain WFS driving function is introduced for the synthesis of a spatially band-limited plane wave. The driving function is obtained based on a time-domain representation of the sound field which is given as a superposition of plane waves with time-varying direction and amplitude. The performance of the proposed approach is evaluated by numerical simulations. Practical issues regarding the discretization of the analytic driving function and dynamic range control are discussed.


I. INTRODUCTION
Wave Field Synthesis (WFS) and Near-field Compensated Higher-order Ambisonics (NFC-HOA) are two well known sound field synthesis techniques developed for a physically accurate reconstruction of a desired sound field using a loudspeaker array [1]- [5]. The loudspeakers, termed secondary sources, are driven in such a way that the superposition of the individual sound fields matches the desired field. One of the main difference between WFS and NFC-HOA is the mathematical representation of the desired sound field, from which the driving function is derived. In NFC-HOA, the sound field as well as the driving function are given as a spherical/circular harmonics expansion with respect to the center of the spherical/circular secondary source distribution. In a practical system, only a finite number of secondary sources are available which constitutes a discretization of the secondary source distribution leading to spectral repetitions in the harmonics domain [6]. To avoid spectral overlap, i.e. spatial aliasing, the harmonics expansion is truncated to a finite order [7,Sec. 4.4.1]. Typical NFC-HOA driving functions thus have finite spatial bandwidth, whereas conventional WFS driving functions exhibit infinite spatial bandwidth [8].
While such a spatial band-limitation causes deviations in representing a sound field, there are also benefits that can be exploited. As observed in a number of studies [4], [9], the synthesized sound field of NFC-HOA exhibits high accuracy at the center of expansion. This comes at the cost of impaired This research was supported by DFG SP 1295/7-1. physical and perceptual qualities at off-center positions [8]- [10]. Therefore, NFC-HOA can be considered as a sound field synthesis technique aiming at an accurate synthesis within a local region, called local sound field synthesis. The location of the local listening area can be moved by re-expanding the sound field with respect to the center of the respective local listening area [11], [12].
The same principle can be applied to WFS. In [13], the WFS driving functions were derived based on an order-limited circular/spherical harmonics expansion of the desired sound field. As expected, an improved performance was achieved in the region centered around the expansion point. However, the driving functions introduced in [13, Table. 1] are available only in the time-frequency domain, and their implementation is computationally demanding as it requires the computation of Bessel and spherical Bessel functions.
In this paper, a time-domain driving function is proposed for the synthesis of a Dirac-shaped plane wave that exhibits a limited spatial bandwidth. The driving function is derived based on an analytic time-domain representation of the sound field (Sec. II). Two different derivations are introduced both of which lead to the same result. One is based on a circular harmonics expansion while the other on a plane wave decomposition. The proposed WFS driving function (Sec. III) is able to synthesize the sound field of a plane wave with a significantly reduced computational complexity. The spatio-temporal structure and the spectral responses of the synthesized sound fields are investigated by numerical simulations (Sec. V). Nomenclature A position vector is denoted by lowercase boldface x. The angular frequency is denoted by ω = 2πf with f being the temporal frequency. The speed of sound is denoted by c and the imaginary unit by i. Plane waves are assumed to propagate parallel to the xy-plane. The propagation direction is characterized by a normal vector n pw = cos φ pw , sin φ pw , 0 where φ pw denotes the azimuth angle.

A. Circular Harmonics Expansion
The sound field of a monochromatic plane wave e −i ω c r cos(φ−φpw) can be represented as a circular harmonics expansion for a given expansion center x c [14, (2.44)]. By truncating the order to ±M , the infinite summation reduces to a finite series where J m (·) denotes the Bessel function of order m. The inner product of two vectors is denoted by ·, · . Note that the truncated expansion is now an approximation of the plane wave with a limited accuracy [15]. The local coordinate system is defined as x = x − x c = (r cos φ , r sin φ , 0). Without loss of generality, x c = 0 is assumed in the remainder, thus r = r and φ = φ.
The inverse Fourier transform of the Bessel function in (1) reads which can be obtained by reformulating Eq. (11.4.24) in [16]. T m (·) denotes the Chebyshev polynomial of the first kind of degree m, and Π ct 2r a rectangular function The time-domain representation of (1) is thus obtained by replacing J m ( ω c r) with (2), s(x, t) In the second equality, − ct r and ( r c ) 2 − t 2 were replaced by cos β(t) and | sin β(t)|, respectively. Equation (5) can be further simplified by exploiting T m (cos θ) = cos(mθ) [16, (22.3.15)] and the addition formula of cosine functions [16, (4.3

B. Plane Wave Decomposition
The time-domain representation can also be obtained from a plane wave decomposition where the sound field is given as a superposition of a continuum of plane waves.S(α, ω) denotes the spectral weight of the plane wave propagating in the direction of n α = (cos α, sin α, 0). The time-domain representation of (8) reads wheres(α, t) denotes the inverse Fourier transform ofS(α, ω) and * t the time-domain convolution. For a spatially bandlimited plane wave, the plane wave decomposition coefficients are merely weights given as a periodic sinc function [17, (4.95)]S(α) =s(α) = D M (α − φ pw ) and the time-domain plane wave decomposition is given as Since the argument of δ t − r c cos(φ − α) has only simple (10) is thus rewritten as which leads to the same result as (5). Equation (11) shows more clearly that the sound field is a superposition of two plane waves. For brevity, the propagation angles are denoted as φ ± pw (t) ≡ φ+π±β(t) with the corresponding normal vectors For t = ± r c , there is only one plane wave propagating in φ and φ + π respectively.
The spatial structure of a spatially band-limited plane wave is illustrated in Fig. 1. The wavefronts of two instantaneous plane waves intersect at x. The circular wave front of radius |ct| is a result of the superimposed plane waves, which converge toward the expansion center for t < 0 and then diverge for t > 0.

III. WAVE FIELD SYNTHESIS
The theory of WFS is based on the Kirchhoff-Helmholtz integral equation, which states that a homogeneous sound field within a source-free volume can be recreated by a continuous distribution of elementary sources on the surface. A highfrequency/far-field approximation is applied to the integral equation, thereby obtaining a formula that only consists of secondary monopoles. As a result, the driving function is given as the directional gradient of the desired sound field evaluated on the surface. For a more comprehensive introduction to WFS, the reader is referred to [3] and [5,Ch. 2].
In this paper, the desired sound field is synthesized by using a set of secondary point sources located in the xy-plane. Such a configuration is called 2.5-dimensional, due to the mismatch of the dimensionality between the sound field (2D) and the acoustic properties of the secondary sources (3D).

A. Driving Functions
The 2.5D WFS driving function for a plane wave with reference point x ref reads [5, (2.177) and (2.178)] in the time-frequency domain and in the time domain. As illustrated in Fig. 2, the normal vector n 0 (x 0 ) = (cos φ n , sin φ n , 0) is an inward pointing vector defined for a given secondary source at x 0 ∈ ∂Ω. The secondary source selection window a(x 0 ) assures that the propagation of the synthesized sound field is correct [19, (5)]. For virtual plane waves, it is given as The term x 0 − x ref mediates the amplitude decay errors occurring in 2.5D WFS. In this paper, the expansion center is chosen as the reference point, The spectral weight i ω c in (13) is realized by the preequalization filter h eq (t) which is applied to the individual source signals or to the loudspeaker signals [19]. For convenience, h eq (t) is omitted in the following derivation.
Instantaneous plane waves passing a secondary source ∈ ∂Ω. The plane waves and propagates in the direction of φ0 + π + β(t) and φ0 + π − β(t) respectively. The angles between the plane wave directions n ± pw and the normal vector n 0 are indicated by shaded arcs. According to the secondary source criteria (15), the blue plane wave is activated whereas the red plane wave is deactivated, since n + pw , n 0 (x 0 ) > 0 and n − pw , n 0 (x 0 ) < 0.
For the synthesis of a spatially band-limited plane wave, the driving function (14) is applied to the instantaneous plane waves in (7) and thus Note the time-dependencies of the secondary source selection windows and the scalar products due to the varying plane wave directions. An alternative time-domain driving function can be derived from plane wave decomposition (9) d(x 0 , t) = 2r π 2π 0 a(x 0 , α) n α , n 0 (x 0 ) (17) where the plane wave driving function (14) is applied to the individual plane waves in (9).

A. Discretization
To be implemented in a practical system, the driving function (16) has to be sampled uniformly in the time domain, i.e. t n = n/f s . Note that a sampling in the time domain constitutes a sampling of the plane wave directions φ ± pw (t) for |t| ≤ r c . Due to the nonlinear dependency of φ ± pw (t) on t, the sampled plane wave angles φ ± pw (n/f s ) have a nonuniform distribution in [0, 2π). For a given sampling frequency f s , the number of plane waves is N pw ≈ 2r c f s . For each secondary source, a different set of plane waves are selected.
A higher temporal resolution can be achieved by computing the driving function at a higher sampling rate, and then downsampling it to the original sampling rate. This will reduce the time-domain aliasing to some extent but not completely, since the Bessel functions in (1) are not band-limited [16,Ch. 9]. An analytic anti-aliasing filter can be used to suppress the aliasing artifacts, but this is out of the scope of this paper. Using a higher sampling rate improves also the spatial resolution of the driving function, as the number of plane waves increases proportional to f s .
The second driving function (17) is discretized separately in the temporal and spatial domain. In the time domain, the individual plane waves denoted by the Dirac delta function δ(·) are realized by rounded integer delays or by fractional delay filters [20]. In either case, the signal carried by the plane wave is low-pass filtered thereby avoiding temporal aliasing. In the spatial domain, the plane wave angle α is discretized, and thus the integral is replaced by a summation. In this paper, uniform sampling is considered, α ν = 2πν Npw , ν = 0, . . . , N pw −1. Although both sampling schemes are independent from each other, there is an interrelation when a large number of plane waves are synthesized in order to achieve a high spatial resolution. In this case, it is crucial to maintain a high temporal resolution by using fractional delay filters. Otherwise (using integer delays), two or more plane waves may have identical driving functions and increasing the number of plane waves has no benefit, similar to the observation in [21]. The number of required fractional delay filters is proportional to the number of plane waves and to the number of secondary sources. Therefore, higher spatial resolution comes at the cost of higher computational complexity.

B. Dynamic Range Control
If the driving function is sampled close to t = ± r c , the term 1 | sin β(t)| in (16) tends to infinity. To avoid numerical problems, a soft-knee limiting function, is applied to the latter term, where the threshold A thr determines its maximum value. Note that the driving function is more compressed around t = ± r c where most of the highfrequency components are present. Such a dynamic range control thus has a low-pass filtering effect and alleviates the effects of temporal aliasing.

V. EVALUATION
The driving functions proposed in this paper are used for the synthesis of a virtual plane wave. The plane wave is assumed to propagate parallel to the xy-plane with an azimuth angle φ pw = − π 2 . The desired sound field is expanded with respect to an expansion center up to the order of M = 29. of the plane wave which passes the origin at t = 0. For a practical implementation, a pre-delay has to be applied.
The driving functions are used for the synthesis of the desired sound field, and the time snapshots (t = 0 ms) of the synthesized sound fields are shown in Fig. 3 (bottom). The direction of the desired plane wave is indicated by black arrows, and the base of each arrow is placed on the center of expansion x c . The narrow wavefront observed around x c indicates that the local sound field exhibits a high accuracy throughout a wide frequency range. In lateral positions, on the other hand, a low-pass filtered response is expected because of the wide wavefront. This agrees with the observation in the previous study [13]. Note also that the sound field for x c = (0, 0, 0) resembles the sound field of a typical NFC-HOA system [7,Fig. 5.16] where a similar spatial band-limitation was applied.
The spectral properties of the synthesized sound fields are examined in Fig. 4. The results in the bottom row are computed with an oversampled rate f ↑ s = 2 × f s = 88.2 kHz. The leftmost column shows the influence of the dynamic range control introduced in IV-B. The higher the threshold, the more high frequency components are observed. Nearly flat responses were obtained with A thr = 17 (top) and A thr = 24 (bottom).
The right six figures in Fig. 4 show the frequency responses of the synthesized sound field at different receiver positions which coincide with the respective expansion center x c . Here, the proposed driving function (CHT) is compared with the driving function given in (17). The latter was implemented with and without fractional delay filters (PWD-f and PWD-i, respectively). For PWD-f, 20-th order Lagrange filters were used. As can be seen, the proposed driving function (CHT) achieves a comparable performance to the implementation using fractional delay filters, with a much lower computational cost.
Oversampling the driving function clearly improves the spectral accuracy. It removes the high frequency bump which is a result of temporal aliasing. The low frequency roll-off is a typical characteristic of WFS, which is attributed to the high-frequency approximations mentioned in III.

VI. CONCLUSION
In this paper, a time-domain WFS driving function is introduced for the synthesis of a spatially band-limited plane wave. It is based on the analytic time-domain representation of the sound field, which was derived directly from the circular harmonics expansion as well as from the plane wave decomposition. It was shown that the time-domain sound field can be interpreted as a continuum of plane waves, and the amplitudes and directions of the instantaneous plane waves were identified. Based on the analytic representation, the WFS driving function was derived and used in a 2.5D scenario. The proposed driving function is able to synthesize a plane wave with an improved local accuracy, which suggests its usage for local sound field synthesis. The spectral distortion of the sound field is comparable to the direct implementation of plane wave decomposition, but with a much lower computational complexity.
It is still an open question how to perform the dynamic range control and anti-aliasing filtering at the same time. Although an optimal value for A thr can be found empirically, a more sophisticated method is desirable. Perceptual evaluation and comparison with other existing local WFS approaches [22], [23] remain as future work.