Sparsity promoting LMS for adaptive feedback cancellation

In hearing aids (HAs), the acoustic coupling between the microphone and the receiver results in the system becoming unstable under certain conditions and causes artifacts commonly referred to as whistling or howling. The least mean square (LMS) class of algorithms is commonly used to mitigate this by providing adaptive feedback cancellation (AFC). The speech quality after AFC and the amount of added stable gain (ASG) with AFC are used to assess these algorithms. In this paper, we introduce a variant of the LMS that promotes sparsity in estimating the acoustic feedback path. By using the lp norm as a diversity measure, the approach does not enforce, but takes advantage of sparsity when it exists. The performance in terms of speech quality, misalignment, and ASG of the proposed algorithm is compared with other proportionate-type LMS algorithms which also leverage sparsity in the feedback path. We demonstrate faster convergence compared with those algorithms, quality improvement of about 0.25 (on a 0–1 objective scale of the hearing-aid speech quality index (HASQI)), and about 5 dB ASG improvement compared with the normalized LMS (NLMS).


I. INTRODUCTION
In hearing aids (HAs), acoustic feedback is a well-known problem that causes howling and whistling effects that are annoying to users. Under certain conditions, the receiver signal feeding back to the microphone makes the system unstable. This not only degrades the audio quality but also limits the amount of amplification that the HA can provide. To overcome this, many adaptive feedback cancellation (AFC) techniques have been proposed for modern HAs [1].
In AFC, an adaptive filter is continuously adjusted to approximate the impulse response (IR) of the acoustic feedback path. In the adaptation stage, least mean square (LMS) algorithms [2] are the most widely used techniques due to their computational simplicity and effectiveness. However, the estimate given by the LMS is inevitably biased due to the correlation between the incoming signal and the feedback signal [3]. Several methods have been proposed to address this bias issue, such as the filtered-X LMS (FXLMS) [4], [5], the prediction-error-method (PEM) based AFC [6], insertion of probe noise [7], phase modulation [8], PEM with frequency shifting [9], and the dual-microphone approach [10].
Assuming that the bias problem can be handled well by the above techniques, a question of interest is: can we further improve the convergence behavior of the AFC in other respects? Observing that typical feedback path IRs are (quasi) sparse, as shown in Fig. 2, one might think of taking advantage of this sparseness. This can be done through the concept of proportionate adaptation, which originated from the proportionate normalized LMS (PNLMS) algorithm [11]. The main idea behind proportionate adaptation is to update each filter coefficient independently of the others by assigning to the corresponding step size a weight in proportion to the magnitude of the estimated coefficient. In other words, it redistributes the adaptation gains among all coefficients and emphasizes the large ones in order to speed up their convergence.
However, the original PNLMS is beneficial mainly for systems with very sparse structures. For the AFC application, where the feedback path IRs are usually quasi-sparse, other proportionate-type LMS algorithms can be more suitable. For example, the improved PNLMS (IPNLMS) [12] and the IPNLMS-l0 [13] have the flexibility to identify systems with different levels of sparsity. Attempts have been made to incorporate these proportionate algorithms into AFC [14]-[16], and improvements have been reported. However, these proportionate algorithms were not formally derived by minimizing an underlying objective function, so their usage can be further optimized. Moreover, the parameters within these algorithms do not have direct connections to the sparsity degree of the underlying system they aim to identify.
In this paper, we propose a new AFC approach that incorporates sparsity in the feedback path estimate based on the pLMS algorithm of [17], [18]. This method, called sparsity promoting LMS (SLMS), takes advantage of sparseness in the feedback path when such sparsity is present, without enforcing it. By adding the lp norm as a diversity measure to the objective function of the ordinary LMS, the algorithm is derived using the affine scaling method [19] in the minimization procedure. The benefit of using the lp norm comes from its direct connection to the system sparseness, which provides a practical way of selecting parameters. Our algorithm has the advantages of theoretical support, simpler parameter optimization, and more straightforward leverage of (quasi) sparsity in acoustic feedback paths. Simulation results using feedback paths measured with real HAs show that the SLMS outperforms other proportionate-type LMS algorithms in terms of audio quality, misalignment, and added stable gain (ASG).

II. AFC SYSTEM
Fig. 1 shows the typical AFC framework. The AFC filter W(z) is an FIR filter placed in parallel with the HA processing G(z); it continuously adjusts its coefficients to emulate the IR of the feedback path F(z). Here x(n) is the desired input signal and d(n) is the actual input to the microphone, which contains x(n) and the feedback signal y(n) generated by the HA output s(n) passing through F(z). The signal ŷ(n) is the estimate of y(n) given by the output of W(z), and e(n) = d(n) − ŷ(n) is the feedback-compensated signal which, ideally, should be identical to x(n). In practice, however, the AFC is not perfect and therefore ŷ(n) ≠ y(n), resulting in distortion between e(n) and x(n).
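The signal relations described above can be collected in one place. The block below is a sketch under simplifying assumptions: all filters are treated as time-invariant, the lowercase impulse responses f(n) and g(n) and the convolution notation are introduced here for illustration, and the closed-loop expression is the standard consequence of these relations rather than a formula quoted from the paper.

```latex
\begin{align}
  d(n) &= x(n) + y(n), \qquad y(n) = f(n) * s(n), \\
  e(n) &= d(n) - \hat{y}(n) = x(n) + \bigl[\, y(n) - \hat{y}(n) \,\bigr], \\
  s(n) &= g(n) * e(n)
  \;\;\Longrightarrow\;\;
  E(z) = \frac{X(z)}{1 - G(z)\,\bigl[F(z) - W(z)\bigr]} .
\end{align}
```

Under these assumptions, the residual path F(z) − W(z) is what remains in the loop: the closer W(z) is to F(z), the more gain G(z) can be applied before the denominator approaches zero.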
In our system we adopt the PEM-based AFC framework [6], in which a time-varying pre-filter A(z) is present and adapted using linear prediction of e(n) [20]. We also employ a band-limited filter H(z) from the FXLMS approach to concentrate on the frequency regions where oscillation is likely to occur [5]. This filter can also be viewed as a very rough approximation of the feedback path in the frequency domain [4].
In this framework, the LMS coefficient adaptation is carried out in the pre-filtered signal domain, where the pre-filtered signals u_f(n) and e_f(n) are used to update the L-tap AFC filter coefficient vector w(n):

w(n + 1) = w(n) + µ(n) u_f(n) e_f(n), (1)

where the vector u_f(n) collects the L most recent pre-filtered samples, [u_f(n), u_f(n − 1), ..., u_f(n − L + 1)]^T. A time-varying step size µ(n), given by (2), is usually employed to improve the convergence rate. It is formed from the step size parameter µ > 0, a small positive constant that prevents division by zero, and a power estimate computed with a forgetting factor 0 < ρ ≤ 1. This is the (modified) normalized LMS (NLMS) [21], which has been widely used in speech processing, especially for AFC in HAs [5], [6], [22].
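A minimal Python sketch of one such normalized update follows. The exact forms of the step size and power estimate are not reproduced in this text, so the code assumes a recursive power estimate of u_f²(n) + e_f²(n) smoothed by ρ, and a step size of µ divided by L times that estimate. The function name `nlms_step`, the argument names, and the 3-tap system in the demo are hypothetical. The demo runs in open loop, without the pre-filter or the hearing-aid loop, purely to show the update converging.

```python
import random


def nlms_step(w, u_buf, e_f, power, mu=0.005, rho=0.985, eps=1e-6):
    """One normalized LMS update (a sketch, not the authors' exact code).

    w     : list of L filter coefficients
    u_buf : the L most recent pre-filtered input samples, newest first
    e_f   : current pre-filtered error sample
    power : running power estimate returned by the previous call
    """
    L = len(w)
    # Assumed form: recursive power estimate with forgetting factor rho.
    power = rho * power + (1.0 - rho) * (u_buf[0] ** 2 + e_f ** 2)
    # Assumed form: step size normalized by L times the power estimate;
    # eps prevents division by zero.
    mu_n = mu / (L * power + eps)
    w_new = [wl + mu_n * ul * e_f for wl, ul in zip(w, u_buf)]
    return w_new, power


# Open-loop identification of a hypothetical 3-tap system (noise-free).
random.seed(0)
h_true = [0.5, -0.3, 0.1]
w_hat = [0.0] * len(h_true)
u_buf = [0.0] * len(h_true)
power = 1.0  # start at the input variance to avoid a large first step
for _ in range(3000):
    u_buf = [random.gauss(0.0, 1.0)] + u_buf[:-1]
    d = sum(h * u for h, u in zip(h_true, u_buf))
    e = d - sum(w * u for w, u in zip(w_hat, u_buf))
    w_hat, power = nlms_step(w_hat, u_buf, e, power, mu=0.2)
```

The demo uses a larger µ than the value in the simulation section only so that the toy example converges within a few thousand samples.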

III. PROPORTIONATE ALGORITHMS FOR AFC
To incorporate the proportionate adaptation idea into the AFC, the update rule (1) becomes:

w(n + 1) = w(n) + µ(n) P(n) u_f(n) e_f(n), (3)

where P(n) is an L-by-L diagonal matrix assigning different weights to the step sizes for different filter taps. We refer to it as the "proportionate matrix". Applying the PNLMS [11], at the n-th iteration the diagonal entries of P(n), for l = 0, 1, ..., L − 1, are computed as in (5) from the magnitudes of the current coefficient estimates, with positive constants κ and ζ.
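The proportionate update and the PNLMS weights can be sketched as follows. The PNLMS rule used here is the commonly cited one (each weight is the coefficient magnitude, floored at κ times the larger of ζ and the largest magnitude). Normalizing the weights by their mean, so that uniform weights give exactly the NLMS step, is an assumption and not a form taken from this text. The function names `pnlms_gains` and `proportionate_step` are hypothetical.

```python
def pnlms_gains(w, kappa=0.1, zeta=0.01):
    """Diagonal entries of the proportionate matrix for PNLMS (sketch)."""
    mags = [abs(wl) for wl in w]
    # Floor keeps small or zero taps adapting; zeta handles the all-zero start.
    floor = kappa * max(zeta, max(mags))
    gamma = [max(floor, m) for m in mags]
    # Assumed normalization: divide by the mean so the gains average to 1.
    mean_gamma = sum(gamma) / len(gamma)
    return [g / mean_gamma for g in gamma]


def proportionate_step(w, u_buf, e_f, mu_n, gains):
    """w(n+1) = w(n) + mu(n) * P(n) * u_f(n) * e_f(n), with P(n) diagonal."""
    return [wl + mu_n * g * ul * e_f for wl, g, ul in zip(w, gains, u_buf)]


# A single dominant tap receives ten times the gain of the floored taps.
example_gains = pnlms_gains([1.0, 0.0, 0.0, 0.0])
```

With all coefficients at zero, every gain equals 1, so the first iterations behave like the NLMS.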
When using the IPNLMS [12], substituting its weight definition (7) into (5) gives the diagonal entries (8), where a small constant δ > 0 is added to prevent division by zero and −1 ≤ α ≤ 1 is a parameter that can be chosen for different sparsity levels: when α = 1 it behaves like the PNLMS, while for α = −1 it reduces to the NLMS. When using the IPNLMS-l0 [13], an approximation of the l0 norm replaces the l1 norm in the IPNLMS. This gives (9), which results in the diagonal entries (10), with another parameter β that provides control for identifying systems with different degrees of sparsity.
The PNLMS is more suitable for very sparse systems. The IPNLMS and IPNLMS-l0 have parameters (α and β) for fitting different sparsity degrees, but these parameters have no direct connection to the sparsity level of the system. Moreover, the formulations of these algorithms do not have a formal theoretical foundation.
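The two weight rules can be sketched in Python as below. The expressions are the commonly cited IPNLMS and IPNLMS-l0 forms, rescaled here so that α = −1 yields unit gains (the NLMS); that scaling, the exponential approximation of the l0 norm, and the function names `ipnlms_gains` and `ipnlms_l0_gains` are assumptions rather than formulas quoted from this text.

```python
import math


def ipnlms_gains(w, alpha=0.0, delta=1e-6):
    """IPNLMS gains: blend of a uniform term and an l1-normalized term."""
    L = len(w)
    l1 = sum(abs(wl) for wl in w)
    return [(1.0 - alpha) / 2.0
            + (1.0 + alpha) * L * abs(wl) / (2.0 * l1 + delta)
            for wl in w]


def ipnlms_l0_gains(w, alpha=0.0, beta=150.0, delta=1e-6):
    """IPNLMS-l0 gains: |w_l| is replaced by 1 - exp(-beta * |w_l|),
    whose sum over taps approximates the l0 norm (assumed approximation)."""
    L = len(w)
    a = [1.0 - math.exp(-beta * abs(wl)) for wl in w]
    l0 = sum(a)
    return [(1.0 - alpha) / 2.0
            + (1.0 + alpha) * L * al / (2.0 * l0 + delta)
            for al in a]


# alpha = -1 removes the proportionate term, leaving unit gains.
uniform = ipnlms_gains([1.0, 0.0, 0.0, 0.0], alpha=-1.0)
```

In this sketch a larger β makes the l0 approximation saturate sooner, so more taps count as "active"; α moves the rule between uniform and fully proportionate gains.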

IV. PROPOSED SPARSITY PROMOTING LMS ALGORITHM
In this section we present the derivation of the proposed AFC algorithm that promotes sparsity. Let E[·] denote the expectation operator. In the pre-filtered signal domain of the AFC, the ordinary LMS considers a mean squared error (MSE) minimization problem over the filter coefficients w. To promote sparsity in the solution w, we modify (12) as (13), in which the lp norm diversity measure of w is added with a regularization parameter γ > 0 [17], [18]. From [19], the gradient of the lp norm term with respect to w is given by (14), where Π(w) = diag(|w_l|^(p−2)). Applying this to the gradient of (13) and setting it to zero, we obtain condition (15) for the optimal solution w_o, in which the rescaled regularization parameter equals γp/2 and I denotes the identity matrix. This suggests the iterative procedure (16) for computing w_o, which uses the definitions in (17) and the auxiliary variables in (18).
At the n-th iteration, for w ∈ R^L, we define the corresponding affinely scaled variable as in (19) [19]. With this transformation into the affine scaling domain, (20) gives q(n + 1) ≡ q(w(n + 1)) at the n-th iteration. Using (20), the update rule (16) can be written equivalently as (21). Observing that (21) is the minimizer of the quadratic objective function (22), we can derive the steepest descent algorithm (23), where µ > 0 is the step size parameter.
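The derivation above can be written out as follows. This is a reconstruction under assumptions, following the structure of the pLMS derivation in [17]: the symbols R, b, d_f(n), M(n), and \tilde{\gamma}, and the exact form and grouping of each line, are introduced here for illustration and may differ from the authors' notation and equation numbering.

```latex
\begin{align}
  % MSE objective and its sparsity-regularized version
  J(\mathbf{w}) &= E\bigl[e_f^2(n)\bigr], \qquad
  e_f(n) = d_f(n) - \mathbf{w}^T \mathbf{u}_f(n), \\
  J_\gamma(\mathbf{w}) &= E\bigl[e_f^2(n)\bigr] + \gamma \|\mathbf{w}\|_p^p, \qquad
  \|\mathbf{w}\|_p^p = \sum_{l=0}^{L-1} |w_l|^p, \\
  % gradient of the diversity measure
  \nabla_{\mathbf{w}} \|\mathbf{w}\|_p^p &= p\, \boldsymbol{\Pi}(\mathbf{w})\, \mathbf{w}, \qquad
  \boldsymbol{\Pi}(\mathbf{w}) = \mathrm{diag}\bigl(|w_l|^{p-2}\bigr), \\
  % stationarity, with R = E[u_f u_f^T], b = E[d_f u_f], M = Pi^{-1/2}
  \mathbf{w}_o &= \mathbf{M}(\mathbf{w}_o)\bigl[\mathbf{M}(\mathbf{w}_o)\mathbf{R}\,\mathbf{M}(\mathbf{w}_o)
      + \tilde{\gamma}\,\mathbf{I}\bigr]^{-1} \mathbf{M}(\mathbf{w}_o)\,\mathbf{b}, \qquad
  \tilde{\gamma} = \tfrac{\gamma p}{2}, \\
  % fixed-point iteration
  \mathbf{w}(n+1) &= \mathbf{M}(n)\bigl[\mathbf{M}(n)\mathbf{R}\,\mathbf{M}(n)
      + \tilde{\gamma}\,\mathbf{I}\bigr]^{-1} \mathbf{M}(n)\,\mathbf{b}, \qquad
  \mathbf{M}(n) = \mathrm{diag}\bigl(|w_l(n)|^{1-p/2}\bigr), \\
  % affine scaling and the equivalent update
  \mathbf{q}(\mathbf{w}) &= \mathbf{M}^{-1}(n)\,\mathbf{w}, \qquad
  \mathbf{q}(n+1) = \bigl[\mathbf{M}(n)\mathbf{R}\,\mathbf{M}(n)
      + \tilde{\gamma}\,\mathbf{I}\bigr]^{-1} \mathbf{M}(n)\,\mathbf{b}, \\
  % quadratic objective minimized by q(n+1)
  J_n(\mathbf{q}) &= \mathbf{q}^T\bigl[\mathbf{M}(n)\mathbf{R}\,\mathbf{M}(n)
      + \tilde{\gamma}\,\mathbf{I}\bigr]\mathbf{q} - 2\,\mathbf{q}^T \mathbf{M}(n)\,\mathbf{b}, \\
  % steepest descent on J_n
  \mathbf{q}(n+1) &= \mathbf{q}(n) - \mu\Bigl\{\bigl[\mathbf{M}(n)\mathbf{R}\,\mathbf{M}(n)
      + \tilde{\gamma}\,\mathbf{I}\bigr]\mathbf{q}(n) - \mathbf{M}(n)\,\mathbf{b}\Bigr\}.
\end{align}
```

Multiplying the last line by M(n) and using w(n) = M(n) q(n) gives w(n + 1) = (1 − µ\tilde{\gamma}) w(n) + µ M²(n) [b − R w(n)]. Replacing R and b by the instantaneous values u_f(n) u_f^T(n) and d_f(n) u_f(n) turns the bracket into u_f(n) e_f(n), which is consistent with the weight |w_l(n)|^(2−p) on each step size, since M²(n) = diag(|w_l(n)|^(2−p)).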
Transforming (23) back to the original coefficient domain using (19) and (20), we obtain (24). Finally, to derive the adaptive algorithm, we replace R and b in (24) by their corresponding instantaneous values. This gives the update rule (25), with M(n) as defined in (17). This is the pLMS algorithm proposed in [17]. Note that M²(n) has a role similar to that of the proportionate matrix P(n) in (3). In fact, it indicates that the weight assigned to each step size is in proportion to |w_l(n)|^(2−p). Based on this and the previous discussion on proportionate adaptation for AFC, we propose the update rule (26), where µ(n) is the normalized step size as in (2), the diagonal entries of P(n), for l = 0, 1, ..., L − 1, are computed as in (5), and the underlying weights are given by (27), where p ∈ (0, 2] and c > 0 is a small positive constant.
Several observations can be made here. First, comparing (26) to (25), we have set γ to zero, as it has negligible effect as long as we keep it small [18]. In that case, due to the presence of P(n), the algorithm still exploits, though does not enforce, the sparsity of the solution when it exists. We therefore refer to it as the sparsity promoting LMS (SLMS). However, a practical issue arises when γ = 0 in (25): once any of the AFC filter taps becomes 0, it is no longer updated. We therefore add a small constant c > 0 to all the tap magnitudes, as in (27), before they are used to compute P(n).
The parameter p is much more influential. When p is smaller (close to or less than 1), the algorithm becomes more sparsity promoting due to the nature of the lp norm, while for p = 2 it reduces to the NLMS. This indicates that a sparse system benefits more from a smaller p, while for a dispersive system a value of p close to 2 is preferable. This makes the advantage of incorporating the lp diversity measure clearer. For the quasi-sparse feedback IRs in AFC, we expect the optimal p to lie between 1 and 2.
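The proposed update can be sketched in a few lines of Python. The weight (|w_l(n)| + c)^(2−p) follows the description of (27) in the text; normalizing the weights by their mean, so that p = 2 gives exactly the NLMS step, is an assumption about (5). The function names `slms_gains` and `slms_step` are hypothetical, and the normalized step size `mu_n` is passed in rather than computed here.

```python
def slms_gains(w, p=1.5, c=1e-6):
    """Diagonal entries of P(n) for the SLMS (sketch).

    Each weight is (|w_l| + c)^(2 - p); c keeps zero-valued taps adapting.
    """
    gamma = [(abs(wl) + c) ** (2.0 - p) for wl in w]
    # Assumed normalization: divide by the mean so the gains average to 1.
    mean_gamma = sum(gamma) / len(gamma)
    return [g / mean_gamma for g in gamma]


def slms_step(w, u_buf, e_f, mu_n, p=1.5, c=1e-6):
    """w(n+1) = w(n) + mu(n) * P(n) * u_f(n) * e_f(n) with SLMS gains."""
    gains = slms_gains(w, p, c)
    return [wl + mu_n * g * ul * e_f for wl, g, ul in zip(w, gains, u_buf)]


# p = 2 gives unit gains (NLMS); smaller p concentrates gain on large taps.
w_demo = [1.0, 0.0, 0.0, 0.0]
gains_p2 = slms_gains(w_demo, p=2.0)
gains_p15 = slms_gains(w_demo, p=1.5)
gains_no_c = slms_gains(w_demo, p=1.5, c=0.0)  # zero taps get zero gain
```

The last line illustrates the practical issue noted above: without the constant c, a tap that reaches zero receives zero gain and stops adapting.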

V. SIMULATION RESULTS
The above algorithms were evaluated using computer simulations in MATLAB at a sampling rate of 16 kHz. The experimental setup was as follows. The HA processing was G(z) = g z^−d with g = 20 and d corresponding to a delay of 8 ms. The feedback path IRs were measured using a behind-the-ear HA with open fitting on a dummy head and truncated to a length of 263 samples, as shown in Fig. 2. The AFC filter length was L = 100 to cover the significant part of the IRs. The forgetting factor was ρ = 0.985 and the step size parameter was µ = 0.005. All small positive constants, including δ and c, were set to 10^−6. The band-limited filter was H(z) = 1 − 1.8z^−1 + 0.81z^−2, as used in [4]. The pre-filter A(z) was an FIR filter of order 20, updated every 10 ms via linear prediction of e(n) [20].
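A few of these settings are easier to read once converted to samples and gains. The short Python sketch below does that arithmetic; the variable and function names are hypothetical, and the only inputs are the values quoted above.

```python
import cmath
import math

FS = 16000                            # sampling rate in Hz
delay_samples = round(0.008 * FS)     # 8 ms forward-path delay in samples
update_period = round(0.010 * FS)     # pre-filter update interval in samples
afc_span_ms = 1000.0 * 100 / FS       # time span covered by L = 100 taps

H = [1.0, -1.8, 0.81]                 # H(z) = (1 - 0.9 z^-1)^2


def freq_response(b, omega):
    """Evaluate an FIR filter with coefficients b at radian frequency omega."""
    return sum(bk * cmath.exp(-1j * omega * k) for k, bk in enumerate(b))


dc_gain = abs(freq_response(H, 0.0))          # 1 - 1.8 + 0.81
nyquist_gain = abs(freq_response(H, math.pi)) # 1 + 1.8 + 0.81
```

The double zero at z = 0.9 strongly attenuates low frequencies relative to high frequencies, which is in line with the stated purpose of H(z): concentrating adaptation on the region where oscillation is likely to occur.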
For the purpose of evaluation, the following metrics were used. First, to measure the distortion between e(n) and x(n), the hearing-aid speech quality index (HASQI) was used [23]. The HASQI score ranges from 0 to 1, where a higher score indicates better quality. For evaluating convergence performance, the normalized misalignment between F(e^jω) and F̂(e^jω) was considered, where F(e^jω) and F̂(e^jω) are the frequency responses of the measured and estimated feedback IRs, respectively. Note that the estimated feedback response is given by the band-limited filter and the AFC filter together, as F̂(e^jω) = H(e^jω)W(e^jω).
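A minimal Python sketch of the misalignment metric follows. Since the defining equation is not reproduced in this text, the code assumes the usual definition: the energy of F − F̂ divided by the energy of F, in dB, evaluated on a uniform DFT grid. The function names and the toy filters in the demo are hypothetical.

```python
import cmath
import math


def convolve(a, b):
    """Linear convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def dft(x, n_fft):
    """Frequency response of x on n_fft uniformly spaced frequencies."""
    return [sum(xn * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n, xn in enumerate(x))
            for k in range(n_fft)]


def misalignment_db(f_true, h, w, n_fft=64):
    """Normalized misalignment in dB, with F_hat = H * W (assumed definition)."""
    f_est = convolve(h, w)
    F = dft(f_true, n_fft)
    F_hat = dft(f_est, n_fft)
    num = sum(abs(a - b) ** 2 for a, b in zip(F, F_hat))
    den = sum(abs(a) ** 2 for a in F)
    return float("-inf") if num == 0.0 else 10.0 * math.log10(num / den)


# Toy check: an estimate that is 10% too small in amplitude.
h_demo = [1.0, -1.8, 0.81]
w_true = [0.2, -0.1, 0.05]
f_demo = convolve(h_demo, w_true)
mis = misalignment_db(f_demo, h_demo, [0.9 * x for x in w_true])
```

An all-zero AFC filter gives 0 dB under this definition, so negative values indicate that the filter has learned part of the feedback path.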
The ASG was estimated following [24].
In our first experiment, we examine the effect of p in the SLMS on channels with different sparsity levels. In addition to the measured feedback path f1, we considered two other artificial channels, as plotted in Fig. 3. The input was a stationary speech-shaped noise. Fig. 4 shows the convergence behavior in terms of misalignment. We can see that for the feedback path case, p = 1.5 outperforms the other values. For the sparser system, a smaller p around 1.2 is preferable, while p around 1.8 gives the best performance for the dispersive one. These results show that the SLMS exploits the underlying system structure in the way we expect. Furthermore, Fig. 4(a) shows that the SLMS (with a good choice of p) improves the convergence rate over the NLMS while still maintaining a low steady-state error, which is not achievable by only using a larger step size parameter µ for the NLMS.
In the next experiment, we ran the AFC system with the SLMS on 25 male and 25 female speech signals from the TIMIT database and measured the corresponding HASQI of the feedback-compensated signal e(n). Average HASQI scores over the 50 speech files for different values of p are shown in Fig. 5. We can see that the optimal p lies in almost the same range even as the feedback IR differs. This means that, for a given HA device, rough knowledge of the sparsity degree of its feedback channel is sufficient to choose p, since performance is not very sensitive to p near the optimal point. From the results we found p around 1.5 to be a good choice, which also agrees with the result in Fig. 4(a).
We also compare the SLMS with other proportionate algorithms for the AFC. The parameter settings were as follows. PNLMS: κ = 0.1 and ζ = 0.01; IPNLMS: α = 0; IPNLMS-l0: α = 0 and β = 150. For the SLMS we used p = 1.5. The NLMS (which is equivalent to the IPNLMS and IPNLMS-l0 with α = −1 and to the SLMS with p = 2) is also compared. Fig. 6 compares the tracking performance in terms of misalignment and ASG with speech-shaped noise input. To model a highly time-varying feedback environment, the feedback path was initially f1, switching to f2 and then f3 at 1/3 and 2/3 of the input sequence, respectively. We can see that the proportionate algorithms generally converge faster than the NLMS. Among all of them, the SLMS shows the best convergence behavior and can provide up to about 5 dB additional ASG compared to the NLMS.
Finally, for further verification, we ran the algorithms on the speech dataset and measured the average HASQI under four different feedback scenarios, as shown in Fig. 7. The first three cases were fixed environments with f1, f2, and f3. The last case, f123, was the feedback path changing from f1 to f2 and then f3 at 1/3 and 2/3 of the input sequence, respectively. We see that the SLMS outperforms all the other algorithms, most clearly under adverse feedback conditions such as the last two cases (about 0.25 HASQI improvement compared to the NLMS in the last case).

VI. CONCLUSION
In this paper we introduced the SLMS for AFC to exploit sparsity in estimating the feedback path. This approach extends the LMS algorithm by incorporating the lp diversity measure in the objective function. We derived the update rules and discussed guidelines for choosing the system parameters. We presented simulation results with a speech-shaped stimulus and with speech segments, using real-world feedback paths, including conditions where the feedback path changes over time. The results show that the SLMS (i) is not very sensitive to the choice of p around the optimal point; (ii) improves the HASQI by about 0.25 over the NLMS; (iii) improves the ASG by about 5 dB compared to the NLMS; and (iv) performs better than other proportionate-type LMS algorithms.

Fig. 2. Measured acoustic feedback path IRs: (a) f1, no obstruction; (b) f2, with a cellphone close to the ear; and (c) f3, with a cellphone right on the ear.

Fig. 7. Comparison of speech quality for different feedback environments. The first three cases were fixed environments with f1, f2, and f3. The last case, f123, was the feedback path changing from f1 to f2 and then f3 at 1/3 and 2/3 of the input sequence, respectively.