A wavelet‐based filtering of ensemble background‐error variances

Background‐error variances estimated from a small‐size ensemble of data assimilations need to be filtered because of the associated sampling noise. Previous studies showed that objective spectral filtering is efficient in reducing this noise, while preserving relevant features to a large extent. However, since such filters are homogeneous, they tend to smooth small‐scale structures of interest.


Introduction
The background-error covariance matrix B plays a central role in data assimilation schemes by weighting the information from the observations and the background state in the analysis.
Recently there has been growing interest in estimating background-error covariances from ensemble data assimilation systems, either in the Kalman filter context (Evensen, 2003) or in the variational framework (Kucukkaraca and Fisher, 2006;Raynaud et al., 2009;Bonavita et al., 2011).
However, the high computational cost of such ensembles in operational applications limits the ensemble size (namely O(100) members), leading to a significant sampling noise which has to be filtered out. The goal of the filtering step is to remove the noise while retaining as much as possible the important signal features. Traditionally, this is achieved by linear processing such as Wiener filtering (Wiener, 1949).
Previous studies on the filtering of ensemble-based variances (Raynaud et al., 2008(Raynaud et al., , 2009) provided useful information with regard to the associated sampling noise, such as its statistical properties. Moreover, Wiener filtering of ensemble variances has been successfully implemented in large-scale applications at Météo-France (Raynaud et al., 2009) and the European Centre for Medium-range Weather Forecasts (ECMWF) (Bonavita et al., 2011).
The Wiener filter, however, optimizes the trade-off between an averaging of the signal discontinuities and the removal of the noise in the smooth regions in order to minimize the mean-square error. As a result, some noise is left in the smooth regions while the discontinuities are averaged a little. Discontinuities in background-error variance fields typically correspond to high forecast errors associated with severe weather events, e.g. midlatitude storms and tropical cyclones. The averaging of such error structures, which has for instance been observed by Bonavita et al. (2011), can then result in a smaller impact of relevant observations in these regions during the assimilation step.
In order to preserve such coherent features in variance fields, the filter has to adapt to the local structure of the signal. This could be achieved by performing the filtering either in grid-point space or wavelet space. A first contribution to heterogeneous variance filtering in grid-point space has been proposed by Raynaud and Pannekoucke (2012) based on the integration of the heterogeneous diffusion equation. In the present paper, the application of nonlinear filtering in wavelet space is examined.
The wavelet transform, thanks to its excellent localization property, has rapidly become an essential signal-and imageprocessing tool for a variety of applications, including denoising. Denoising by wavelet coefficient thresholding is a commonly used method and was first proposed by Donoho and Johnstone (1994). The algorithm compares each wavelet coefficient of the noisy signal to a given threshold: if the coefficient is smaller than the threshold then it is set to zero; otherwise it is kept or modified (depending on the thresholding rule). The idea behind thresholding is to distinguish between the insignificant coefficients likely due to noise, and the significant coefficients consisting of important signal structures. The denoised signal is then reconstructed from the selected coefficients.
The paper is organized as follows. Section 2 introduces the technique of wavelet coefficients thresholding. Section 3 reviews some properties of the sampling noise associated with the estimation of background-error variances from an ensemble of forecasts. Wavelet thresholding is then applied in section 4 to 1D analytical signals corrupted by a Gaussian white noise. The case of a correlated and heterogeneous noise, as often encountered in real applications, is examined in section 5. Section 6 considers the extension to a more realistic 2D framework, using background-error statistics of the Arome convective-scale model. Finally, conclusions and perspectives are given in section 7.

Denoising by wavelet thresholding
In this section, we introduce the mathematical formalism associated with the denoising by wavelet thresholding, as initially proposed by Donoho and Johnstone (1994) to denoise signals affected by a Gaussian white noise.

Theoretical aspects
We consider a discrete signal S of size n = 2 J , affected by a Gaussian white noise W of mean zero and variance σ 2 W , resulting in a noisy signal X: We decompose the noisy signal into an orthogonal wavelet series X = J−1 j=0 2 j −1 i=0X i,j ψ i,j , where ψ i,j is the wavelet function at position index i for scale j andX i,j = ψ i,j |X is the associated wavelet coefficient, ·|· denoting the inner product (Mallat, 1999). The wavelet coefficients {X i,j } i=0,2 j −1 at scale j thus define an approximation of X on a grid whose resolution depends on j: the finer the scale the higher the resolution.
Denoising by thresholding wavelet coefficients consists of keeping only the coefficients whose modulus is above a given threshold value T: The denoised signalX is finally reconstructed using an Wavelet thresholding is first motivated by the fact that the decorrelating property of the wavelet transform reveals sparsity of the signal if any, i.e. most wavelet coefficients are close to zero (Mallat, 1999). Moreover, since the noise is spread out equally over all coefficients, if the noise level is not too high it is then possible to discriminate between signal and noise coefficients.
The idea behind wavelet thresholding is to test each wavelet coefficient in order to check if it is compatible with a Gaussian white noise with standard deviation σ W . This can be achieved by performing a statistical test, allowing us to verify that particular properties of the noise are consistent with this Gaussian distribution. A possible statistical property is the maximum magnitude that can be encountered when sampling a Gaussian random variable of standard deviation σ W and size n. As detailed in Appendix A, this maximum magnitude should be lower than The sampling size n can be understood as the return time of the extreme event consisting of exceeding a magnitude strictly smaller than T D (see Appendix A for details). Since a Gaussian white noise in grid-point space is transformed into an equivalent Gaussian white noise in an orthogonal wavelet representation, the test can be performed on the wavelet coefficients. The noise variance is then calculated is the noise at grid-point l andW i,j is a wavelet coefficient of noise. Therefore, a wavelet coefficient whose modulus is larger than T D is not compatible with the Gaussian assumption. In that case, the coefficient carries more signal information than noise and contributes toX.
It is worth noting that T D is equal to the universal threshold proposed by Donoho and Johnstone (1994), which results in an estimate asymptotically optimal (when n → ∞) in the minimax sense (i.e. minimizing the maximum quadratic error (Mallat, 1999)). This threshold is called universal since it depends on the sampling size n and on the variance of the noise σ 2 W , but not on the signal itself.

Noise variance estimation
When the statistical properties of the noise are known or can be calculated with an appropriate model, the determination of the optimal threshold is straightforward (Donoho and Johnstone, 1994). However, the noise variance is unknown in many situations and has to be estimated. Different methods have been proposed, such as the median absolute deviation (MAD), which estimates the level of noise by taking the median of the modulus of the smallest-scale wavelet coefficients (Mallat, 1999). An alternative approach was introduced by Farge et al. (1999) and Azzalini et al. (2004), based on a recursive estimation of the noise variance and the threshold.
The recursive approach of Azzalini et al. (2004) proceeds as follows. The wavelet signalX is split into a coherent (i.e. noise-free) partX c and an incoherent (i.e. purely noisy) partX inc . The signal is first considered as incoherent (i.e. only due to noise):X inc =X; thus σ 2 √ 2 ln(n). Wavelet coefficients above T 0 are then added to the coherent part while wavelet coefficients below T 0 remain in the incoherent partX The coherent and incoherent parts of the signal are thus recursively constructed, at loop k + 1, based on the estimates √ 2 ln(n). This algorithm is repeated until the number N W of nonzero coefficients in the incoherent part converges, i.e. N W,k+1 = N W,k . At the end of this recursive algorithm, σ W = σ W,k , T D = T k , and the denoised signal is given bŷ X = i,jX c i,j ψ i,j . This algorithm is stable and converges with a finite number of iterations bounded from above by the number of samples n, although in practice very few iterations are needed (Azzalini et al., 2004).
This iterative process is illustrated in Figure 1. Since a white noise is isotropic it is spherical on an orthogonal basis, and the spheres correspond to the maximum noise magnitude (i.e. the threshold) at iterations 0 and 1. Because the initial noise variance σ W,0 is large, most of the wavelet coefficients are smaller than the calculated threshold T 0 . Thus only a few coefficients are larger than the threshold (they correspond to the bold arrows) and are added to the coherent part of the signal. After the first iteration, the estimated noise variance σ W,1 is then smaller than σ W,0 ; thus T 1 < T 0 and the wavelet coefficients such that |X inc i,j | > T 1 (dashed arrows) are added to the coherent signal.

Adaptive local spatial filtering
It is interesting to note that the wavelet thresholding is equivalent to estimating the signal with a filtering that is locally adapted to the signal regularity. This property follows from the fact that the wavelet transform of a function f at scale j and position x j (i) locally measures the variation of f in a neighbourhood of x j (i) whose size is proportional to j (Mallat, 1999, p. 165). Rapid transitions in a signal thus create large wavelet coefficients at fine scales. Given that the wavelet-thresholding selectively sets to zero all coefficients below a threshold T, it thus performs an adaptive filtering that depends on the amplitude of the wavelet coefficients. If |X i,j | > T then the coefficients are relatively large and thus are in the neighbourhood of sharp transitions of f at fine scales. Keeping them avoids smoothing sharp signal variations. In the regions where |X i,j | ≤ T, the coefficients are likely to be small, which means that f is smooth. The noise is then filtered out by setting the wavelet coefficients to zero.
While this section introduced the reasoning behind wavelet thresholding for some noisy signal, in the remainder of the paper this technique is applied to remove noise in a statistic. Here the statistic considered is the sample variance estimated from an ensemble of forecasts.

Estimating variances from ensemble forecasts
This section reviews major results regarding the estimation of background-error variances from an ensemble of forecasts. It is also shown how the noise variance required for the wavelet thresholding (Eq. (1)) can be analytically calculated.

Ensemble variances and their sampling statistics
Background-error variancesṽ estimated from an ensemble of N background errors of size n are affected by a sampling noise, denoted v e , which directly arises from the finite size of the ensemble: where E stands for the expectation operator andṽ = E[ṽ] is the expectation of the ensemble-based variances, which will be referred to as the noise-free estimated variances. It may be mentioned that this sampling error is non-Gaussian; however, the central limit theorem ensures that the sampling distribution of N k=1 v e approaches the Gaussian distribution as the ensemble size N → ∞.
The statistical properties of this sampling noise have been derived by Raynaud et al. (2009), under the assumption of Gaussian background errors. The spatial covariance of the sampling noise is given by whereB is the n × n ensemble-estimated B matrix, B = E[B] and • stands for the Hadamard product (i.e. an element-wise product). It follows that the noise variance is given by However, this formula cannot be used in practice to calculate local noise variances since we need to know in advance the noise-free signalṽ .
In the case of a white noise, the noise energy is equally distributed among all scales and the noise variance is then equal to the average noise variance: It thus comes from Eq. (3) that

Estimation of the noise variance
It has been mentioned in section 2.2 that the noise variance σ 2 W may be obtained through a recursive method. On the other hand, Eq. (4) provides a direct estimate of the noise variance. The application of this analytical relation is now detailed.
SinceB is unknown in practice, a possible solution to estimate the white noise variance according to Eq. (4) is to replace Tr(B •B ) by Tr(B •B). In order to better understand the influence of the finite ensemble size on the estimation of Tr(B •B), it is interesting to examine the sampling properties of this random variable, in particular its statistical expectation and its standard deviation.
It is shown in Appendix B that  N). Moreover, it can be seen that the standard deviation has a minor impact, compared to the bias, for small ensembles. With a 10-member ensemble, for instance, this ratio is around 7%. This thus shows that with current operational ensemble sizes (namely between 10 and O(100) members), Tr(B •B) is a quite accurate estimation of Tr(B •B ). The white noise variance can then be estimated by In the context of the ensemble estimation of backgrounderror variances, the statistical properties of the associated sampling noise thus allow us to directly calculate a relatively accurate estimation of the average noise level.

Denoising of 1D variance fields corrupted by a Gaussian white noise
The application of wavelet thresholding to ensemble-based variances is illustrated in this section by estimating variances in an idealized experimental one-dimensional set-up.
An ensemble of N random error realizations of size n is generated by randomizing a prescribed 'true' background-error covariance matrix (Fisher and Courtier, 1995). Variances estimated from this ensemble are then decomposed into an orthogonal wavelet series, and the wavelet coefficients are thresholded using the threshold value calculated from Eq. (1). The filtered variances are finally obtained from an inverse wavelet transform of the selected wavelet coefficients. For the experiments presented in this paper, we use the Coiflet-5 wavelets (i.e. with five vanishing moments; Mallat, 1999).
The domain is a circle of radius a = 6370 km, which corresponds to the Earth's great circle. This circle is divided into n = 512 = 2 9 equally spaced grid points.

The prescribed covariance matrix
Homogeneous and isotropic correlations are obtained from the Gaussian function where x is a point on the circle, r is a separation value and L b is the correlation length-scale (Daley, 1991;. Following Pannekoucke et al. (2007), heterogeneous correlations are then computed using a c-stretching Schmidt transformation (Courtier and Geleyn, 1988), adapted to the circle and defined by The associated correlation length-scales are sharper in the centre of the domain. On the other hand, the prescribed variances v are relatively smooth over a large part of the domain with a value of 1, and there is a sharp transition in the centre of the domain with an increase of the variances by a factor 3 ( Figure 3). This rapid increase of variances may simulate what is observed in the vicinity of low-predictability events (e.g. midlatitude storms, tropical cyclones).

Filtering results
In this section, the efficiency of a wavelet thresholding of estimated variances is examined in the context of a white noise. This is achieved by setting a very low value for the background-error length-scale (i.e. L b 1) in constructing the true covariance matrix (Eq. (7)). Moreover, the noise variance required for the calculation of the threshold value is calculated using both the recursive algorithm (section 2.2) and the theoretical formula (Eq. (6)). Figure 3(a) shows the prescribed variances along with the raw variances estimated from a 50-member ensemble. The denoised variance field after a thresholding of wavelet coefficients is shown in Figure 3(b). The sampling noise is removed to a large extent, while the spatial variations of the prescribed variances are preserved. The relative error of the estimated variances is reduced from 20% to 7% on average. It may be mentioned that threshold values calculated using the recursive algorithm and the theoretical formula lead to identical denoised variances.
The efficiency of the wavelet-thresholding method is related to the separation between signal and noise wavelet coefficients. Figure 4 presents the histograms of the wavelet coefficients for the raw variancesX i,j and for the noiseW i,j . It turns out that the noise coefficients are concentrated within the range [−T D , T D ] with T D ≈ 0.92. Moreover, the noise dominates the signal (i.e.X i,j ≈W i,j ) within the range [−T D , T D ]. As a result, the useful signal corresponds to the coefficients whose modulus is larger than T D , and thus it can be efficiently retrieved through wavelet coefficient thresholding.
With smaller ensemble sizes (namely O(50)), the larger amplitude of the noise makes the discrimination between signal and noise more difficult. Therefore, there is some residual noise after the wavelet thresholding that could be avoided by slightly increasing the threshold value, for instance (not shown).
Under the assumption of a Gaussian white noise, wavelet thresholding is thus a straightforward and efficient method to extract the signal of interest. Moreover, one advantage of wavelet thresholding is that it does not require any trial and error tuning.

Denoising of 1D variance fields corrupted by a correlated and heterogeneous noise
In this section, the efficiency of wavelet thresholding is examined in the presence of a correlated and heterogeneous noise.

Spatial structure of the noise in real applications
The theoretical basis of the wavelet thresholding described in section 2 relies on the assumption of a Gaussian white noise. However, in practical applications the sampling noise associated with the estimation of background-error variances is often correlated. This can be seen from Eq.
(3), which implies the following relationship (Raynaud et al., 2009) between the spatial correlation length scales of sampling noise (denoted L v e ) and of background error (denoted L b ): The assumption of a white noise is thus verified when background errors are not or only slightly correlated, which may be the case in dynamical regions, for instance (e.g. in the vicinity of lows and troughs). On the other hand, when background errors are correlated, the associated sampling noise is correlated as well.
In addition, according to several studies (Thépaut et al., 1996;Ingleby, 2001;Pannekoucke et al., 2007) backgrounderror correlations in realistic numerical weather prediction (NWP) applications are heterogeneous (i.e. L b is not constant in space), which implies that the associated noise is also heterogeneous.

'Scale-dependent' threshold
The method of wavelet thresholding has been generalized to correlated noise (Johnstone and Silverman, 1997). In this case, one can apply a different threshold for each scale j: where σ (j) is the noise standard deviation associated with scale j and n j = 2 j is the number of wavelet coefficients at scale j. σ (j) and T D (j) could be estimated with a scale-wise extension of the recursive algorithm presented in section 2.2 (Nguyen et al., 2011). The difficulty of this approach lies in the estimation of the variance of the wavelet coefficients of the noise at each scale.
Two problems can limit the quality of this estimation. First there is a statistical limitation, since the variance at scale j is estimated from 2 j realizations. Since the standard deviation of the relative error in the estimated standard deviation σ (j) is equal to 2 2 j −1 , a relative error smaller than 10% can be achieved only for scales j ≥ 8. Secondly, at each scale the noise variance is estimated, it is necessary that only a few coefficients are due to the signal. In general, this is only the case at the smallest scales. It may then be expected that the larger scale the noise, the less efficient the denoising.
Finally, the 'scale-dependent' generalization of wavelet thresholding is particularly adapted to a homogeneous noise. In that case, the noise variance is constant within scales so that σ (j) can be accurately calculated from the wavelet coefficients of the noise at scale j. If the noise is heterogeneous then the calculated σ (j) corresponds to the average noise level at scale j and the 'scale-dependent' thresholding may then be suboptimal. In the next section, we propose an alternative method to deal with a correlated and heterogeneous noise.

'Equivalent' white noise threshold
Because of the limitations raised by the 'scale-dependent' formulation in the presence of a heterogeneous noise, an alternative solution is detailed below. The threshold value is calculated using the global universal threshold (Eq. (1)), under the assumption of a white noise with standard deviation α × σ W : where σ W = Tr(E[v e v e T ])/n is the average standard deviation of the correlated noise. It may be mentioned that, since the noise is correlated, the recursive algorithm described in section 2.2 is no longer efficient to calculate σ W . In that case, σ W is estimated from the theoretical formulation (Eq. (6)), leading to A graphical interpretation of this choice is given in Figure 5. White noises are represented by spheres, while a correlated noise is represented by an ellipsoid (which reflects the variation of the noise level with the direction). The formulation of the threshold in Eq. (10) assumes that the correlated noise is replaced by an 'equivalent' white noise with standard deviation α × σ W . Using a multiplicative factor α ≥ 1 helps to reduce some residual noise arising from the scales where the noise level is above the average level (i.e. σ (j) ≥ σ W ). A trivial upper bound for the parameter α is equal to max j σ (j)/σ W (dashed-dotted sphere). However, using this upper bound would result in setting too many coefficients to zero. The choice of the parameter α is thus based on the optimization of the trade-off between the removal of the noise (where σ (j) > σ W ) and the averaging of the useful signal (where σ (j) < σ W ). A possible choice for α could be, for instance, α = median( σ (j) σ W ≥ 1).

Filtering results
The experimental set-up is as described in section 4, except that a non-zero correlation length-scale is now used. L b is set to 250 km in Eq. (7), which results in local length scales that vary between 100 km in the centre of the domain and 600 km at the edges of the domain (Figure 6). It may be mentioned that such length-scale values are close to those encountered in real applications (e.g. Figures 11 and 12 of . This experimental setting then ensures a realistic level of correlation and heterogeneity for the noise. In accordance with the prescribed local background-error length scales, the noise presents relatively short variations in the centre of the domain while it is larger scale elsewhere (Figure 7(a)). This is also supported by the scalogram of the noise (Figure 7(b)). As expected, the amplitude of small-scale coefficients tends to be larger in the centre of the domain. The root mean square error of estimated variances as a function of the parameter α, defined by 1 n n i=1 (v(α) − v ) 2 , wherev is the filtered variance estimate, is shown in Figure 8 for a 50-member ensemble. The curve indicates that there is an optimal value that minimizes the error. In the present case α opt = 2.3 and the associated wavelet thresholding leads to estimated variances (Figure 9(c)) whose relative error is around 10% on average (compared to 20% for raw estimated variances, Figure 9(a)).
The impact of the choice of α is illustrated in Figure 9(b) and (d). With α = 1 (Figure 9(b)), although the variances are much less noisy than the raw estimates, there remains some significant small-scale sampling noise in the vicinity of the variance peak. With α = 2 × α opt (Figure 9(d)), the  filtering is too strong and does not provide an accurate representation of the variance peak. With an appropriate choice of α, it thus turns out that the accuracy of the estimated variances is improved and is close to that obtained in the case of a Gaussian white noise.
Histograms of wavelet coefficients for signal and noise ( Figure 10) indicate the presence of noise coefficients larger than the threshold T D = σ W √ 2 ln(n) ≈ 0.92. This is consistent with the discussion in section 5.3 and justifies the use of a factor α > 1. With the optimal α = 2.3, T D ≈ 2.12 and one can see that the noise coefficients are concentrated within the range [−T D , T D ]. The use of this threshold thus enables all the noise coefficients to be removed, while the useful signal coefficients are preserved. It may also be mentioned that the noise is less Gaussian than in the white noise case.
For comparison purposes, Figure 11(a) presents the estimated variances after applying an optimized homogeneous filter. The signal is quite well represented by this homogeneous filter; however, as detailed in Raynaud and Pannekoucke (2012), the amplitude of the variance peak is   indicates that the performance of the wavelet thresholding is comparable to that of an optimized heterogeneous diffusion-based filter. This similar performance is encouraging since the wavelet thresholding is easier to implement than the diffusion-based heterogeneous filter. The main advantage is that it does not require knowledge of local background-error length scales, as is the case for the parametrization of diffusion in Raynaud and Pannekoucke (2012).

Denoising of variance fields in a convective-scale model
The extension of previous studies conducted in an idealized 1D framework is examined here in a more realistic 2D framework, using the convective-scale Arome-France model (Seity et al., 2011).

The dataset
The Arome-France model is a spectral, non-hydrostatic convective-scale model which has been running operationally at Météo-France since December 2008 over French territory with a 2.5 km horizontal resolution up to 1.35 hPa. The assimilation is based on a 3-hourly 3D-Var scheme. Ensemble variational assimilation, as used routinely at Météo-France for the global Arpège model  and at ECMWF (Bonavita et al., 2011), is now a common technique to estimate flow-dependent background-error covariances. Based on the same principles, an Arome ensemble assimilation can be generated, as detailed in Brousseau et al. (2011). In this set-up, the Arome EDA is based on perturbed Arome 3D-Var analysis/forecast steps, and takes its lateral boundary conditions from the Arpège ensemble assimilation.
The Arome ensemble used in this study has been run with six members over an autumn period in October/November 2011. Background-error variances and horizontal length scales have been estimated from this six-member ensemble for one particular day, and filtered with a simple spatial average to remove sampling noise. These filtered estimates then provide realistic features for backgrounderror covariance modelling as detailed in section 6.2.

Background-error covariance modelling
A common way of modelling inhomogeneous and anisotropic correlations is the use of the diffusion equation, as introduced by Weaver and Courtier (2001). Within this formulation, the shape of the modelled correlations is controlled by the local diffusion tensor, which is shown to be directly related to the local correlation length scales (Pannekoucke and Massart, 2008). More precisely, if L x and L y denote the zonal and meridional length scales respectively, Pannekoucke and Massart (2008) introduce the tensor When the pseudo-time integration range of the diffusion equation is fixed to [0, 1], the local diffusion tensor ν(x) at a position x = (x, y) is then related to by ν(x) = (x)/2.
Denoting by L the propagator associated with the time integration of the heterogeneous diffusion equation, the operator associated with the integration over [0, 1/2] is then L 1/2 . A correlation matrix C = C 1/2 C T/2 can be built from the product of the linear operator where W −1 is a metric tensor used to convert a canonical vector into a Dirac distribution (Pannekoucke and Massart, 2008), and is a diagonal matrix used as a normalization to force matrix C to have diagonal elements equal to one in accordance with a correlation matrix. A covariance matrix B = B 1/2 B T/2 with specified variance field can then be obtained from the diagonal matrix of the standard deviation field with The resulting background covariance field can be described by its variance field and its length-scale field. By construction, the variance field is exactly the one prescribed, i.e. the diagonal of 2 . On the other hand, the resulting length-scale field is a smooth version of the prescribed one.
An ensemble data assimilation can provide flowdependent estimates of background-error variances and correlations. In particular, Belo Pereira and Berre (2006) and Pannekoucke and Massart (2008) described economical formulae to estimate correlation length-scales from an ensemble. In this section, univariate horizontal backgrounderror covariances of temperature at 850 hPa (hereafter T850) are modelled with the diffusion approach, using ensemblebased variances and horizontal length scales estimated from the six-member Arome ensemble after a simple spatial average.
In order to test the efficiency of a wavelet denoising, an ensemble of random error realizations is then generated using this modelled 'true' covariance matrix. Variances estimated from this ensemble are filtered by applying a 2D wavelet thresholding over a domain discretized in 256 × 256 grid points.

Filtering results
Figure 12(a) and (b) presents the prescribed T850 variances and their estimation from a 50-member ensemble respectively. As expected, the large-scale signal of interest is contaminated by a smaller-scale sampling noise. Moreover, as indicated by Figure 13, this noise is strongly heterogeneous, with spatial variations which are larger in scale over oceans than over lands. Figure 12(c) presents the variances obtained after a wavelet thresholding using the universal threshold (Eq. (1)) and the noise variance calculated according to Eq. (6). This thresholding is quite efficient to remove most of the sampling noise, while preserving the relevant features. The filtered variances are relatively close to the prescribed ones, apart from some residual sampling noise. This residual sampling noise is consistent with results from the 1D framework (Figure 9(b)), and it can be reduced by inflating the threshold, as described in section 5.3 (Eq. (10)). As shown by Figure 12(d), using a multiplicative factor α = 2 enables the residual sampling noise to be removed to a large extent. This optimal α factor has been determined so that it minimizes the rms error, as detailed in section 5.4. Note that this value of α is different from that found in the 1D case, since α is expected to depend on the system configuration (i.e. the signal size and the ensemble size) and on the correlation degree of the noise.

Conclusions and perspectives
This paper introduces and tests a wavelet-based filter of ensemble background-error variances. The filtering is realized using a thresholding of the wavelet coefficients of the estimated variances. It is shown that this approach is equivalent to applying an adaptive local spatial filtering.
The most efficient application of wavelet thresholding is under the assumption of a Gaussian white noise. In that case, the threshold value is simply a function of the signal size and the noise standard deviation. Farge et al. (1999) proposed a recursive algorithm to estimate the noise level. On the other hand, an alternative solution to calculate the noise variance is based on knowledge of the statistical properties of the associated sampling noise (Raynaud et al., 2009). We showed that the average noise variance can be accurately derived from the raw estimate of background-error variances, even with a small ensemble.
The method has been illustrated for a simple 1D framework. Under the assumption of a white noise, the wavelet thresholding is shown to work well without any trial-and-error tuning. Moreover, the filtering performance is similar whether the noise variance is calculated with the recursive algorithm or the theoretical formula.
In practical applications, however, the noise is correlated and strongly heterogeneous. This makes the use of the universal global threshold and its 'scale-dependent' generalization suboptimal. An alternative method has been proposed, based on the assumption that the correlated noise can be replaced by a white noise with an appropriate variance to calculate a global threshold. The appropriate variance is larger than the average variance of the correlated noise, in order to remove some residual noise arising from the scales where the noise level is above the average level. The results indicate that this method provides variances whose accuracy is close to the white noise case. Moreover, this wavelet thresholding is shown to outperform the commonly used homogeneous filtering and it compares favourably to heterogeneous filtering in grid-point space.
These results are then confirmed in a more complex 2D framework, considering the Arome background-error statistics. The encouraging results of this study thus suggest that wavelet thresholding is a feasible and efficient approach for heterogeneous filtering of ensemble variances.
The formalism presented in this paper is well adapted to the variance filtering in limited area models. Extension to global models on the sphere could be considered using biorthogonal wavelets and frames (especially tight frames) as suggested by Pannekoucke (2009).