Exploiting Spatial–Temporal Joint Sparsity for Underwater Acoustic Multiple-Input–Multiple-Output Communications

Multiple-input–multiple-output (MIMO) systems offer a promising way to achieve high-data-rate communication over bandwidth-limited underwater acoustic channels. However, MIMO communication not only suffers from intersymbol interference but also introduces additional co-channel interference, which poses challenges for both underwater acoustic MIMO channel estimation and channel equalization. In this article, we propose novel interference cancellation (IC) methods that handle this co-channel interference in the design of both channel estimation and channel equalization. Our channel estimation method utilizes the spatial joint sparsity and the temporal joint sparsity in the multipath structure to estimate sparse channels with common delays under the distributed compressed sensing framework. In this way, we enhance channel estimates with common delays and thus suppress co-channel interference. Meanwhile, to address multipath arrivals with different delays, which are estimated as noise under the simultaneous orthogonal matching pursuit (SOMP) algorithm, we introduce a forward–reverse strategy into the SOMP algorithm; the result is referred to as the FRSOMP algorithm. The proposed FRSOMP algorithm first performs the SOMP algorithm to obtain the initial channel estimates, then performs the forward-add process, which attempts to add promising candidates into the support sets, and finally performs the reverse-fetch process to check whether the candidates in the support sets are retained or removed. The channel estimates are used to directly calculate the filter coefficients for channel-estimation-based decision feedback equalization (CE-DFE). In this article, we also propose a novel CE-DFE receiver with an IC component. We design IC filters based on the traditional CE-DFE, and we derive the coefficients of the feedforward filters, feedback filters, and IC filters from the channel estimates obtained by the FRSOMP algorithm, so that the co-channel interference is suppressed in both channel estimation and channel equalization. We demonstrate the performance of our approach by numerical simulation, a lake experiment, and a sea experiment. The results show that the proposed methods obtain higher output signal-to-noise ratio, lower bit error rate, and more separated constellations than the traditional compressed sensing channel estimation method and the traditional CE-DFE method.


I. INTRODUCTION
Multiple-input–multiple-output (MIMO) systems have the potential to improve the channel capacity significantly in bandwidth-limited underwater acoustic communications. Specifically, since the channel spatial diversity is high, the capacity increase is significant even for a small MIMO configuration. Compared with the wireless channel, the underwater acoustic channel is much more challenging for MIMO technology. Typically, the underwater acoustic channel is characterized by a long delay spread, which causes intersymbol interference (ISI), and by time variations. Moreover, although multiple transmitters improve the data rate considerably, they lead to severe co-channel interference (CoI), which introduces large noise in the per-channel demodulation process. All of these factors pose great difficulties for underwater acoustic MIMO channel estimation and channel equalization.
In this article, we propose to combat CoI by exploiting the temporal and spatial joint sparsity of underwater acoustic MIMO channels. Because of sea surface and sea bottom reflections, the underwater acoustic channel usually has only a few significant taps over a large delay spread, i.e., the underwater acoustic channel is typically sparse. This sparsity can be exploited by employing the compressed sensing (CS) method to improve underwater acoustic channel estimation performance [1]-[11]. However, the performance of the traditional CS channel estimation algorithms in the literature listed above, such as matching pursuit (MP) and orthogonal matching pursuit (OMP), degrades in underwater acoustic MIMO communication because of the strong CoI.
In contrast to conventional CS theory, which reconstructs a single sparse signal on its own, the distributed compressed sensing (DCS) method exploits the sparsity correlation among multiple observed signals to enhance sparse data recovery [12]. DCS-based channel estimation has been successfully used in radio wireless communication [13]-[16]. For example, in [16], DCS-based channel estimation was adopted in MIMO-OFDM systems. Simulation results showed that DCS-based MIMO-OFDM could improve both bandwidth efficiency and bit error rate (BER) performance. In [14], multiuser massive MIMO systems exhibited a joint sparsity structure in the user channel matrices, and a joint OMP recovery algorithm was proposed to perform the channel-state-information-at-transmitter estimation.
Limited investigations of the DCS technique have been reported for underwater acoustic channel estimation. In [2], a DCS-based channel estimation algorithm was utilized to suppress CoI. Adjacent data blocks in time domain were combined to simultaneously recover underwater acoustic channels using the simultaneous orthogonal matching pursuit (SOMP) algorithm. In [3], DCS was proposed to reduce the number of pilot subcarriers in underwater acoustic OFDM channels. Two adjacent data blocks in the frequency domain were combined to obtain joint sparsity channel estimation using the SOMP algorithm. In [4], DCS was used for underwater acoustic multiband transmissions. The traditional SOMP algorithm was used to enhance channel estimates that have common delays, and the proposed multiple-selection-strategy-based SOMP algorithm was utilized to reconstruct channel taps that have different delays. So far, the DCS method has not been extensively investigated in underwater acoustic MIMO channel estimation.
In MIMO communication systems, because the transmitting and receiving antennas are collocated, the propagation delay is roughly the same for all transmit-receive antenna pairs. As a result, the locations of the significant arrivals in all channels are nearly identical [16]. This property can be exploited through a DCS channel estimation scheme. For a typical underwater MIMO communication system, because the communication range is far larger than the receiver aperture, the channels from different transmitters to a receiver may exhibit correlation; namely, some channel taps have the same delays but different coefficients. Furthermore, if the underwater acoustic channels are time invariant or slowly time varying, channels from adjacent data blocks also show strong correlation. The DCS-based channel estimation method is able to utilize such correlation to enhance underwater acoustic MIMO channel estimation performance. In other words, channel estimates that have common delays are enhanced, which results in suppression of the CoI. The SOMP algorithm has been directly utilized to solve DCS problems [2], [3], [13]-[16]. However, channels from different transmitters to a receiver may still have different arrival structures, and these different arrival structures are estimated as noise when the SOMP algorithm is used. In this article, we use the DCS method to estimate the underwater acoustic MIMO channel, which has the potential to suppress the CoI. We not only enhance the channel estimates that have common delays but also reconstruct the channel estimates that have different delays, so that the performance of underwater acoustic MIMO channel estimation is improved.
Different from the work in [2] and [3], which only applied the SOMP algorithm to temporal joint sparsity, and different from the work in [4], which applied the SOMP and multiple-selection-strategy-based SOMP algorithms to temporal and subband joint sparsity in a multiband communication system, the channel estimation method proposed in this article explores both temporal and spatial joint sparsity. Channel taps that have different delays are treated as noise in [2] and [3]. Whereas the multiple-selection-strategy-based SOMP was proposed in [4] to address them, in this article the forward-reverse SOMP (FRSOMP) algorithm is proposed to deal with the channel taps that have different delays.
Equalization including time domain equalization and frequency domain equalization is intensively investigated to eliminate the ISI and frequency-selective distortion. Frequency domain channel equalization usually has lower computational complexity than time domain equalization, thus, the frequency domain orthogonal frequency-division multiplexing (OFDM) approach can be adopted for MIMO underwater acoustic communication purposes [8], [17], [18]. However, because of the high peak-to-average-power ratio, OFDM systems are generally not preferable from an amplifier efficiency point of view. Moreover, the OFDM systems are very sensitive to Doppler shift and phase noise.
For single-carrier underwater acoustic communications, the common time-domain channel equalization approaches are the direct adaptive decision feedback equalization (DA-DFE) [5], [19]-[23] and the channel-estimation-based decision feedback equalization (CE-DFE) [6], [24]-[27]. The DA-DFE approach uses an adaptive algorithm (such as the recursive least squares algorithm) to equalize received symbols. The merit of DA-DFE is that it does not need to estimate the channel explicitly, but it requires long training sequences to achieve convergence. In contrast, the CE-DFE approach directly measures the channel via shorter training sequences and calculates the equalizer filters based on the channel estimates. DA-DFE has been used intensively to equalize underwater acoustic SIMO or MIMO channels. In [5], [19]-[21], DA-DFE was followed by a time reversal operation to further eliminate the ISI for underwater acoustic SIMO communication. In [22] and [23], DA-DFE was utilized for underwater MIMO equalization. In [22], a two-stage time-reversal-based DA-DFE was utilized. At the first stage, time-reversal-based DA-DFE was applied to get initial demodulation results; at the second stage, time-reversal-based DA-DFE was applied again after parallel interference cancellation (IC). In [23], DA-DFE combined with IC was applied to underwater acoustic MIMO communications. The CE-DFE has also been used in underwater acoustic SIMO communication: in [6] and [24]-[27], CE-DFE was utilized for underwater acoustic SIMO equalization, and the feedforward filters and feedback filter were derived based on channel estimates. The traditional CE-DFE in [6] and [24]-[27] is limited for underwater acoustic MIMO communication because of the serious CoI. So far, few works have investigated CE-DFE with IC in underwater acoustic MIMO communications.
Another type of equalization technology is turbo equalization. In [11], [23], [28], and [29], a turbo equalization method was used for underwater MIMO communication with soft IC. In [23], an iterative MIMO decision feedback equalizer with successive IC for both space-time trellis codes and layered space-time codes was proposed in a turbo system. In [11], an enhanced linear minimum mean square error turbo equalization scheme for a spatially multiplexed MIMO system with frequency-selective fading was introduced. In [28], turbo block decision feedback equalization, where the soft decision equalizer performed successive soft IC of both ISI and CoI, was proposed for high-data-rate single-carrier MIMO underwater acoustic communications. In [29], a low-complexity frequency-domain turbo equalizer combined with phase rotation compensation and soft successive IC was adopted for single-carrier MIMO underwater acoustic communication. Unlike the classic channel equalizer, the turbo equalizer needs to perform equalization and decoding iteratively, which incurs significant computational complexity and decreases the raw data rate. Moreover, the structure of the turbo equalizer is complicated.
Because the DCS method needs less observation data to recover signals, it has been successfully used in wireless networks [13]-[16], but it has not been widely used in underwater acoustic channel estimation. In previous research, DCS has been used to improve the channel estimation performance based on the joint sparsity model 2 (JSM2), and the estimates of the taps that have common delays have been improved. However, the taps with different delays have not been considered, which degrades the channel estimation performance. Moreover, the traditional channel estimation methods such as the least squares and OMP algorithms suffer from strong CoI when they are used in underwater acoustic MIMO communication systems. The traditional CE-DFE requires fewer training sequences to obtain the coefficients of the feedforward filters and feedback filter, and it needs fewer controlling parameters than DA-DFE, which is convenient for underwater acoustic SIMO communication [6], [24]-[27]; however, the traditional CE-DFE is limited for underwater acoustic MIMO communication because of the critical CoI.
In this article, our solution heavily relies on the concept of IC both in underwater acoustic MIMO channel estimation and channel equalization. We use the concept of DCS to enhance the accuracy of the channel estimation for underwater acoustic MIMO systems. In this process, we utilize both the spatial joint sparsity and the temporal joint sparsity to reduce CoI for channel taps with common delays. In order to address the channel taps with different delays, we design a novel method that combines the forward-reverse strategy with the SOMP algorithm, thus, the different arrivals have the potential to be reconstructed correctly. The derived algorithm is referred to as FRSOMP. To further suppress the CoI in the channel equalizer, we propose a CE-DFE receiver which contains IC component for MIMO channel equalization. The coefficients of feedforward filters, feedback filters, and IC filters are calculated based on channel estimates. Our proposed CE-DFE with IC needs fewer training symbols to achieve convergence, and directly eliminates the CoI using IC filters. Finally, numerical simulation, a lake experiment, and a sea experiment demonstrate the effectiveness of the proposed methods.
The contribution of this work is twofold. 1) We add a forward-reverse strategy to the SOMP channel estimation algorithm, which yields the FRSOMP algorithm. Channel estimates with common delays are enhanced and can therefore suppress the CoI, and channel estimates with different delays are reconstructed correctly, whereas in the traditional SOMP channel estimation the channel estimates with different delays are treated as noise. Thus, the performance of MIMO channel estimation is improved. 2) We propose a CE-DFE with an IC component for underwater acoustic MIMO equalization, so that the CoI is suppressed further. We derive the coefficients of the feedforward filters, feedback filters, and IC filters based on channel estimates, so the performance of channel equalization is improved.
The rest of this article is organized as follows. Section II describes the channel model for an underwater acoustic MIMO system. Section III outlines DCS-based channel estimation and our proposed FRSOMP channel estimation for underwater acoustic MIMO communications. Section IV derives our proposed CE-DFE receiver that contains the IC component. Numerical simulation results, lake experimental results, and sea experimental results are described and analyzed in Section V. Finally, Section VI concludes this article.
The following notations are used in this article. Bold uppercase and lowercase letters denote matrices and column vectors, respectively. Superscripts $(\cdot)^{*}$, $(\cdot)^{T}$, $(\cdot)^{H}$, and $(\cdot)^{\dagger}$ denote the conjugation, transpose, Hermitian transpose, and pseudoinversion operations, respectively. $\mathbf{0}_{P \times L}$ and $\mathbf{I}$ denote the zero matrix with $P$ rows and $L$ columns and the identity matrix, respectively. $\|\cdot\|_{0}$ and $\|\cdot\|_{2}$ denote the $\ell_0$-norm and the $\ell_2$-norm, respectively. $\mathbf{A}[j]$ denotes the submatrix formed by the $j$th columns of $\mathbf{A}$. $\mathbf{a} \cdot \mathbf{b}$ denotes the dot product of $\mathbf{a}$ and $\mathbf{b}$, and $\langle \mathbf{a}, \mathbf{b} \rangle$ denotes the inner product of $\mathbf{a}$ and $\mathbf{b}$. $E(\cdot)$ denotes the expectation operation. $\cup$ and $\emptyset$ denote the union operation and the empty set, respectively.

II. CHANNEL MODEL FOR UNDERWATER ACOUSTIC MIMO SYSTEM
We consider a MIMO acoustic communication system with $M$ transmitters and $N$ receivers. The discrete baseband signal received at the $n$th hydrophone in the $k$th data block can be written as

$$y_n(k,i) = \sum_{m=1}^{M} \sum_{l=0}^{L-1} h_{m,n}(k,l)\, x_m(k,i-l) + w_n(k,i) \tag{1}$$

where $y_n(k,i)$ and $w_n(k,i)$ are the received signal and the ambient noise at the $n$th hydrophone in the $k$th data block, respectively, and $x_m(k,i)$ denotes the symbols transmitted from the $m$th transmitter in the $k$th data block. $h_{m,n}(k,i)$ is the channel impulse response from the $m$th transmitter to the $n$th receiver in the $k$th data block, which is defined by

$$h_{m,n}(k,i) = \sum_{l} \alpha_{m,n}(k,l)\, \delta\big(i - \tau_{m,n}(k,l)\big) \tag{2}$$

where $\alpha_{m,n}(k,l)$ and $\tau_{m,n}(k,l)$ denote the amplitude and the delay of the $l$th path, respectively. $L$ is the length of the channel impulse response and $i$ is the time index within the observation time. Assuming that the acoustic communication channel is time invariant within a data block of $P$ samples, (1) can be expressed in matrix form as

$$\mathbf{y}_n(k) = \sum_{m=1}^{M} \mathbf{A}_m(k)\, \mathbf{h}_{m,n}(k) + \mathbf{w}_n(k) \tag{3}$$

where $\mathbf{A}_m(k)$ is the $P \times L$ Toeplitz-type matrix formed from the transmitted symbols $x_m(k,i)$, as defined in (4). The vectors $\mathbf{y}_n(k)$, $\mathbf{h}_{m,n}(k)$, and $\mathbf{w}_n(k)$ collect the $P$ received samples, the $L$ channel taps, and the $P$ noise samples of the $k$th data block, as defined in (5)-(7). The MIMO channel impulse response $\mathbf{h}_{m,n}(k)$ can be estimated by the least squares algorithm based on (3). The underwater acoustic channel is typically sparse, and the Toeplitz-type matrix $\mathbf{A}_m(k)$ satisfies the restricted isometry property [30], so the CS method can be directly utilized to estimate the sparse underwater acoustic channel. Because of the strong CoI, however, the performance of the traditional CS-based channel estimation algorithms degrades for underwater acoustic MIMO channels.
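As a minimal sketch of the block model in (3), the NumPy code below builds a Toeplitz-type measurement matrix from one transmitter's symbols and solves the joint least-squares problem for all $M$ channels seen by one hydrophone. The function names, the zero-padding before the start of the block, and the 0-based tap indexing are assumptions made for illustration; they are not taken from the article.

```python
import numpy as np


def toeplitz_measurement_matrix(x, P, L):
    """Return the P x L matrix whose (i, l) entry is x[i - l].

    Assumption: symbols before the start of the block are zero.
    """
    A = np.zeros((P, L), dtype=complex)
    for l in range(L):
        A[l:, l] = x[:P - l]
    return A


def least_squares_mimo(y, A_list):
    """Jointly estimate the M channels observed by one hydrophone.

    y      : received block of length P
    A_list : list of M measurement matrices, each P x L
    Returns an (M, L) array with one estimated impulse response per row.
    """
    A = np.hstack(A_list)                      # P x (M * L)
    h, *_ = np.linalg.lstsq(A, y, rcond=None)  # stacked channel estimate
    L = A_list[0].shape[1]
    return h.reshape(len(A_list), L)
```

Stacking the $M$ matrices side by side treats all transmitters jointly; with noisy data and sparse channels, this plain least-squares solution is the baseline that the CS and DCS methods aim to improve on.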

III. DCS FOR UNDERWATER ACOUSTIC MIMO CHANNEL ESTIMATION
In this section, we introduce the SOMP-based DCS, and then derive our proposed FRSOMP algorithm.

A. Distributed Compressed Sensing
For underwater acoustic MIMO systems, if channels from different transmitters to a receiver and channels from adjacent data blocks exhibit correlation, they can be modeled by JSM2 in DCS theory [12]. In JSM2, all the channels share a common support set; namely, all the channels have the same tap delays but different coefficients. Fig. 1(a) shows an example of joint sparsity. The two channel impulse responses share a common support set, but the tap coefficients are different. The SOMP algorithm can directly solve the JSM2 problem [2], [3], [13]-[16].
The approach that uses multiple data blocks to find the common channel delays is referred to as joint sparsity estimation. Fig. 1(b) illustrates spatial joint sparsity and temporal joint sparsity channel estimation. The sparse channel delays are estimated jointly from multiple data blocks that come from different transmitters and from adjacent data blocks, as shown in Fig. 1; the channel coefficients are then measured individually based on the estimated channel delays. This is the core principle of the SOMP algorithm. If the channel delays are estimated from the data blocks in the spatial domain, which are marked with a dotted rectangle in Fig. 1(b), we denote the method as spatial joint sparsity simultaneous orthogonal matching pursuit (S-SOMP) channel estimation. If the channel delays are estimated from both the spatial domain and the temporal domain, which are marked with a dotted rectangle and a dotted ellipse, we denote the method as spatial joint sparsity and temporal joint sparsity simultaneous orthogonal matching pursuit (ST-SOMP) channel estimation. Given multiple received data blocks $\mathbf{y}_n(1), \ldots, \mathbf{y}_n(Q)$ and their corresponding measurement matrices, the purpose of the DCS method is to reconstruct the $MQ$ channel impulse responses simultaneously. Based on the DCS theory, we establish the model in (8a) [2], [16].
In (8a), $\mathbf{y}_{m,n}$ is the received signal from the $m$th transmitter to the $n$th receiver. Similarly, $\boldsymbol{\omega}_{m,n}$ is the noise from the $m$th transmitter to the $n$th receiver. To simplify the notation, we drop the receiving hydrophone index $n$ and use one subscript instead of two. Equation (8b) is the resulting mapping of (8a). In practice, $\mathbf{y}_{m,n}$ and $\boldsymbol{\omega}_{m,n}$ cannot be obtained directly, so they are replaced by $\mathbf{y}_n$ and $\boldsymbol{\omega}_n$, respectively, for all $m = 1, \ldots, M$. In (8a), temporal joint sparsity and spatial joint sparsity with a combined dimension of $MQ$ are considered. The advantage of (8a) is that multiple data blocks can be used to improve the detection of the common tap delays, whereas the OMP algorithm uses only one data block. DCS therefore has more potential to improve the underwater acoustic MIMO channel estimation performance. Based on the DCS theory, underwater acoustic MIMO channel estimation can be formulated as the optimization problem (9) [4], [31], [32]. The constraint condition can be described as follows: all the channels contain $S$ multipath arrivals with common delays, while the sparsity of each channel may be different. The SOMP algorithm can be directly utilized to solve (9). The key idea of the SOMP algorithm is the candidate selection rule

$$\lambda = \arg\max_{l} \sum_{k=1}^{MQ} \left| \left\langle \mathbf{A}_k[l], \mathbf{u}_k^{s} \right\rangle \right| \tag{10}$$

The candidate is measured via multiple data blocks, whereas in the traditional OMP algorithm the candidate is measured from only one data block. Because of the strong CoI in MIMO channel estimation, the correlation between the received signal $\mathbf{y}_k$ and its corresponding measurement matrix $\mathbf{A}_k$ is degraded, so the capability of searching for candidates decreases. However, if the channel taps have the same delays, the term $\sum_{k=1}^{MQ} |\langle \mathbf{A}_k[l], \mathbf{u}_k^{s} \rangle|$ has a larger amplitude at those delays; thus, the CoI is suppressed and the tap delays are selected more easily and more reliably. If only one data block is used ($MQ = 1$), the SOMP algorithm reduces to the traditional OMP algorithm. The SOMP algorithm is described in Algorithm 1 [32].

Algorithm 1: The SOMP Channel Estimation Algorithm.
Input: $\mathbf{y}_k$, $\mathbf{A}_k$, $k = 1, \ldots, MQ$; the sparsity $S$.
In each iteration: select the candidate $\lambda$ via (10); measure the individual channel coefficients $\mathbf{h}_k$; measure the individual residual $\mathbf{u}_k^{s}$.
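The joint selection principle of SOMP described above can be sketched in a few lines of NumPy. This is an illustrative reading of the text, not the authors' Algorithm 1: the function name `somp`, the use of `numpy.linalg.lstsq` for the per-block coefficient update, and the masking of already selected indices are assumptions.

```python
import numpy as np


def somp(ys, As, S):
    """Simultaneous OMP over K = M*Q blocks that share tap delays.

    ys : list of K received vectors, each of length P
    As : list of K measurement matrices, each P x L
    S  : number of common taps to select
    Returns the sorted common support and one length-L estimate per block.
    """
    K, L = len(ys), As[0].shape[1]
    residuals = [np.asarray(y, dtype=complex) for y in ys]
    support = []
    hs = [np.zeros(L, dtype=complex) for _ in range(K)]
    for _ in range(S):
        # Joint selection: sum the correlation magnitudes over all blocks.
        score = sum(np.abs(A.conj().T @ u) for A, u in zip(As, residuals))
        if support:
            score[support] = 0.0  # never pick the same delay twice
        support.append(int(np.argmax(score)))
        # Coefficients and residuals are measured individually per block.
        for k in range(K):
            coef, *_ = np.linalg.lstsq(As[k][:, support], ys[k], rcond=None)
            hs[k] = np.zeros(L, dtype=complex)
            hs[k][support] = coef
            residuals[k] = ys[k] - As[k][:, support] @ coef
    return sorted(support), hs
```

With a single block (`K = 1`) the summed score collapses to the ordinary OMP correlation, which matches the remark that SOMP reduces to OMP in that case.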

B. FRSOMP Algorithm
We introduce FRSOMP in this section. JSM2 assumes that all the channel taps have the same delays. In practice, channel taps from different transmitters to a receiver may have different delays. Under JSM2, the taps with different delays are estimated as noise, which leads to worse channel equalization performance. In [33], a mixed support set signal model (MSSM) with a shared structure, in which the signal vector consists of two parts, was introduced. Accordingly, the channels consist of two parts

$$\mathbf{h}_k = \mathbf{h}_k^{(c)} + \mathbf{h}_k^{(d)} \tag{11}$$

In (11), the superscripts $(c)$ and $(d)$ denote the common part and the different part, respectively. Under JSM2, the common part is enhanced by the SOMP channel estimation algorithm, so it has the potential to suppress the CoI. On the other hand, the different part is estimated as noise. Previous works [2], [3], and [13]-[16] were based on JSM2. In this article, the MSSM is considered, so that the different part can be reconstructed correctly.
In the traditional OMP and SOMP algorithms, the candidates are selected index by index, and a candidate that has entered the support set remains there permanently, which means that if a candidate is not chosen correctly, the error propagates to the following iterations. In this article, we propose the FRSOMP algorithm for underwater acoustic MIMO channel estimation. The forward-reverse strategy includes the forward-add and reverse-fetch processes. The serial strategy of adding a potential candidate into the support sets is referred to as forward-add. After the forward-add step is performed, it is natural to include a reliability test. Reverse-fetch is a testing scheme in which the $s$ most prominent candidates are chosen from the $(s + 1)$ candidates in the support sets. Since the initial channel delays are determined by the SOMP algorithm, the channel taps that have common delays are enhanced. By using the reverse-fetch testing scheme, the incorrect different delays are removed, and the potential different delays are added to the support sets and correctly reconstructed. The FRSOMP algorithm has the following steps.
1) The SOMP algorithm is performed to obtain the initial channel estimates $\hat{\mathbf{h}}_k$ by Algorithm 1; thus, the estimation of the common delays is enhanced, which can suppress the CoI. Meanwhile, the different delays are estimated as noise. We use the forward-reverse strategy to remove the erroneous tap delays and introduce correct tap delays in the following steps.

2) Generate the corresponding residual vectors for the reliability testing. The initial support sets $\tilde{\Gamma}_k$ and the initial residuals $\mathbf{u}_k^{s}$ are measured from the initial channel estimates, as shown in (12a) and (12b). Superscript $s$ is the iteration index, where $s = 1, \ldots, S$, and subscript $k$ is the data block index, where $k = 1, \ldots, MQ$. In this step, $S \times MQ$ support sets and $S \times MQ$ residuals are stored ($s = 1, \ldots, S$). These residuals serve as the initial residuals for the comparison in step 4. In (12a), we select the $s$ largest coefficients and save their corresponding positions into $\tilde{\Gamma}_k$. Note that in (12a), $|\tilde{\Gamma}_k| = s$ means that the cardinality of $\tilde{\Gamma}_k$ is $s$; in other words, there are $s$ nonzero elements in $\tilde{\Gamma}_k$.

3) The forward-add process: A promising candidate is chosen by the DCS method in (13a) and is tentatively added to the support sets in (13b), so the cardinality of $\Gamma_k^{s+1}$ is larger than that of $\Gamma_k^{s}$ by one, which is why the process is called forward-add. In (13a), the candidate $\lambda_{\max}$ is determined by multiple data blocks, whereas the residual is measured individually, based on the support set selected in (13b). If the added candidate is the desired one, it remains in the support sets; otherwise, it is removed by the reverse-fetch strategy in step 4.

4) The reverse-fetch process: The reverse-fetch strategy is a test of whether the previously added candidates are valid. Based on the support sets from the forward-add process, the new channel coefficients $\mathbf{h}_k$, the temporary support sets, and the temporary residuals are measured, as shown in (14a)-(14c). For clarity, we denote $\mathbf{u}_k^{s}$, $\mathbf{u}_k^{s+1}$, and $\tilde{\mathbf{u}}_k$ as the current residual, intermediate residual, and temporary residual, respectively. Analogously, we denote $\Gamma_k^{s}$, $\Gamma_k^{s+1}$, and $\tilde{\Gamma}_k$ as the current support sets, intermediate support sets, and temporary support sets, respectively. The temporary support sets $\tilde{\Gamma}_k$ are formed from the tap delays that have the $s$ largest coefficients, and the temporary residual is measured from the temporary support sets. Note that there are $(s + 1)$ nonzero elements in $\mathbf{h}_k$ given in (14a)-(14c). Taking the residual norm as the model fit measure, for each $k = 1, \ldots, MQ$, the current residual norm $\|\mathbf{u}_k^{s}\|_2$ is compared with the temporary residual norm $\|\tilde{\mathbf{u}}_k\|_2$. If the temporary residual norm $\|\tilde{\mathbf{u}}_k\|_2$ is smaller than the current residual norm $\|\mathbf{u}_k^{s}\|_2$, there are incorrect candidates in the current support sets $\Gamma_k^{s}$. In that case, the temporary support sets $\tilde{\Gamma}_k$ become the new current support sets $\Gamma_k^{s}$, and the current residual $\mathbf{u}_k^{s}$ is replaced by the temporary residual $\tilde{\mathbf{u}}_k$. After all the residuals have been compared, the iteration counter is decreased by one and the reverse operation of refining the support sets continues, which is why the process is called reverse-fetch. If the current residual norm is smaller than the temporary residual norm, the reverse operation is not performed. Through the reverse-fetch process, the channel taps with common delays remain, and the channel taps with different delays that were incorrectly estimated by the SOMP are gradually replaced by the correct taps.

5) Finally, the channel estimates are measured based on the final support sets $\Gamma_k^{s}$, where $s = S$, as given in (15).

The main difference between the traditional sparse channel estimation methods and the proposed FRSOMP method is that FRSOMP has the capability to suppress the CoI. The main difference between the SOMP algorithm and our proposed FRSOMP algorithm is that FRSOMP attempts to add promising candidates into the support sets through the forward-add process and removes the incorrect candidates through the reverse-fetch process, whereas in the SOMP algorithm the incorrect candidates remain in the support set permanently. Under our proposed FRSOMP algorithm, the channel estimates with common delays are enhanced, which suppresses the CoI, and the channel estimates with different delays are recovered correctly. The pseudocode for our proposed FRSOMP is shown in Algorithm 2.
If the spatial joint sparsity is used by FRSOMP, the channel estimation algorithm is referred to as S-FRSOMP; if both the spatial joint sparsity and the temporal joint sparsity are used by FRSOMP, the channel estimation algorithm is referred to as ST-FRSOMP.
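The forward-add and reverse-fetch idea can be illustrated with the following simplified NumPy sketch. It is not the authors' Algorithm 2: it refines the support sets only at the final sparsity level $s = S$, starts from initial estimates supplied by the caller (for example, from SOMP), and stops when no block improves. The function names, the `max_passes` and `tol` parameters, and this stopping rule are assumptions made for illustration.

```python
import numpy as np


def ls_on_support(y, A, support):
    """Least-squares taps restricted to a support set, plus the residual."""
    idx = sorted(support)
    coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
    h = np.zeros(A.shape[1], dtype=complex)
    h[idx] = coef
    return h, y - A[:, idx] @ coef


def largest_taps(h, s, among):
    """Indices (taken from `among`) of the s largest-magnitude taps of h."""
    among = sorted(among)
    order = np.argsort(np.abs(h[among]))[::-1][:s]
    return {among[i] for i in order}


def frsomp_refine(ys, As, h_init, S, max_passes=10, tol=1e-9):
    """Simplified forward-add / reverse-fetch refinement (assumed variant)."""
    K, L = len(ys), As[0].shape[1]
    supports = [largest_taps(h, S, range(L)) for h in h_init]
    residuals = [ls_on_support(ys[k], As[k], supports[k])[1] for k in range(K)]
    for _ in range(max_passes):
        # Forward-add: one candidate chosen jointly from all blocks.
        score = sum(np.abs(A.conj().T @ u) for A, u in zip(As, residuals))
        lam = int(np.argmax(score))
        improved = False
        for k in range(K):
            trial = supports[k] | {lam}                  # S + 1 candidates
            h_trial, _ = ls_on_support(ys[k], As[k], trial)
            # Reverse-fetch: keep the S most prominent of the candidates.
            temp = largest_taps(h_trial, S, trial)
            _, temp_res = ls_on_support(ys[k], As[k], temp)
            # Accept only if the temporary residual norm is smaller.
            if np.linalg.norm(temp_res) < np.linalg.norm(residuals[k]) - tol:
                supports[k], residuals[k] = temp, temp_res
                improved = True
        if not improved:
            break
    h_hat = [ls_on_support(ys[k], As[k], supports[k])[0] for k in range(K)]
    return h_hat, [sorted(s) for s in supports]
```

Because each block keeps its own support set, a tap that exists in only one channel can replace a tap that the joint SOMP stage forced onto that channel, which is the behavior the reverse-fetch test is meant to provide.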

C. Computational Complexity Analysis
In this section, we roughly compare the computational complexities of the traditional OMP, SOMP, and our proposed FRSOMP algorithms. The complexity of the OMP algorithm at the $s$th iteration is $O(PL + Ps + Ps^2 + s^3)$ [34]. Because the sparsity of the underwater acoustic channel is limited, the maximum value of $s$ is quite small, so $PL \gg (Ps + Ps^2 + s^3)$; thus, the dominating term $PL$ is used for analysis convenience [35]. The total computational complexity of the OMP algorithm is $O(SPL)$, where $S$ is the sparsity. Table I shows the computational complexities of OMP, SOMP, and FRSOMP for each channel estimate. The SOMP algorithm has the same computational complexity as the OMP. For the FRSOMP algorithm, because the total number of iterations cannot be determined exactly, we simply provide lower and upper bounds. If all the channels have common tap delays, the minimum computational complexity is $O((S + 1)PL)$. If the channels have no common tap delays, the maximum computational complexity is $O(3(S + 1)PL)$. From Table I, we can see that the proposed FRSOMP obtains better channel estimation performance at the cost of higher computational complexity.

Algorithm 2 (excerpt): Measure the first $s$ maximum candidates in $\hat{\mathbf{h}}_k$, and store their positions $\tilde{\Gamma}_k$ via (12a).
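To make the bounds concrete, the short sketch below evaluates the dominant operation counts quoted above for given values of $S$, $P$, and $L$. The function name and the example values are hypothetical, and the constants hidden by the $O(\cdot)$ notation are ignored.

```python
def dominant_costs(S, P, L):
    """Dominant per-channel operation counts quoted in the text."""
    return {
        "omp_somp": S * P * L,              # O(S P L)
        "frsomp_min": (S + 1) * P * L,      # all channels share tap delays
        "frsomp_max": 3 * (S + 1) * P * L,  # no shared tap delays
    }


if __name__ == "__main__":
    # Hypothetical sizes: 4 taps, 100-sample block, 50-tap channel.
    print(dominant_costs(S=4, P=100, L=50))
```

For these example sizes the upper bound is three times the lower bound, so the extra cost of FRSOMP relative to SOMP stays within a small constant factor.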

IV. CE-DFE WITH IC
In this section, we derive the CE-DFE with an IC component. Fig. 2 shows the structure of the CE-DFE with IC, which is very similar to the self-iterative equalization solution in [23]. The biggest difference between our receiver and the self-iterative equalization receiver is that the proposed receiver uses the channel estimates to calculate the filter coefficients directly, as shown in Fig. 2, whereas the self-iterative equalization receiver uses an adaptive algorithm to calculate the filter coefficients iteratively. The equalization process is performed symbolwise, and in the following we omit the block index. We consider the underwater acoustic MIMO channel model with $M$ transmitters and $N$ receivers. At the $n$th receiver, the received baseband signal is given by (16), where

$$\mathbf{y}_n = \big( y_n(i + N_c) \ \cdots \ y_n(i) \ \cdots \ y_n(i - N_a + 1) \big)^{T} \tag{17}$$

In (17)-(19), $N_a$ and $N_c$ denote the number of acausal and causal taps, respectively. Based on the model in Fig. 2, the soft decision is expressed in terms of $\mathbf{p}_m$ and $\mathbf{v}_m$. In Fig. 2, the feedforward filters and the feedback filters are denoted as $\mathbf{f}_m$ and $\mathbf{b}_m$, respectively, and the IC filters, which are used to mitigate the signal from the $k$th transmitter, are denoted as $\mathbf{r}_{m,k}$, where $k \neq m$. The lengths of the feedforward filters, feedback filters, and IC filters are denoted as $N_f$, $N_b$, and $N_r$, respectively. Compared with the traditional CE-DFE in [25], our proposed CE-DFE with IC in (22) contains the additional terms $\mathbf{r}_{m,k}^{H} \mathbf{x}_{k}^{b}$, whose purpose is to mitigate the CoI caused by the signal received from the $k$th transmitter.
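The receiver structure described above (feedforward, feedback, and IC filters) can be sketched as a single soft-decision computation. The sign convention, the argument names, and the function name are assumptions for illustration, since they are not specified in the surrounding text; the filter coefficients are taken as given inputs rather than derived.

```python
import numpy as np


def soft_decision(f_list, y_list, b, x_past, r_list, x_other_list):
    """Soft symbol estimate for transmitter m (assumed sign convention).

    f_list       : feedforward filters, one per receiver
    y_list       : received vectors, one per receiver
    b            : feedback filter acting on past decisions of stream m
    x_past       : past decided symbols of stream m
    r_list       : IC filters, one per interfering transmitter k != m
    x_other_list : decided symbols of each interfering transmitter
    """
    # np.vdot conjugates its first argument, so vdot(f, y) equals f^H y.
    feedforward = sum(np.vdot(f, y) for f, y in zip(f_list, y_list))
    feedback = np.vdot(b, x_past)
    interference = sum(np.vdot(r, xk) for r, xk in zip(r_list, x_other_list))
    return feedforward - feedback - interference
```

Passing empty lists for `r_list` and `x_other_list` removes the IC term, which corresponds to the traditional CE-DFE structure without interference cancellation.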
Similar to the work done in [36], we partition the transmitted symbols $\mathbf{x}_m$ into three groups, as in (23). Similarly, we partition the channel matrix $\mathbf{H}_{m,n}$ into three parts $\mathbf{B}_m$, $\mathbf{q}_m$, and $\mathbf{C}_m$, as shown in (21). So (16) can be rewritten as (24). The first term in (24) contains the transmitted symbol $x_m(i)$, which needs to be recovered. The second term can be canceled by the feedback filters. The third term, which contains transmitted symbols from other transmitters, is treated as CoI, which should be canceled by the feedforward filters. The feedforward filters also try to mitigate the last term, which is treated as observation noise. In order to eliminate the transmitted symbols from other transmitters, we design a channel-estimation-based decision feedback equalizer with IC. Based on the traditional CE-DFE, we add IC filters as shown in Fig. 2, so the fourth term in (24) can be canceled by the IC filters, whereas in the traditional CE-DFE, the third and fourth terms in (24) are ignored.
Consider the transmitted symbols $\mathbf{x}_m$ to be zero-mean white sequences with variance $\sigma\mathbf{I}$ that are independent of the channel impulse response and the received noise $\mathbf{u}_m$. Also consider the received noise $\mathbf{u}_m$ to be zero mean with covariance $\mathbf{D}_{u_m}$ and independent of the channel impulse response. Then, we obtain the effective observation noise correlation matrix $\mathbf{R}_m$ as in (25). The optimal equalization coefficients are the solution to (26). With the model defined in (26), a number of approaches can be used to compute the optimal filter coefficients ($\mathbf{f}_m$, $\mathbf{b}_m$, and $\mathbf{r}_{m,k}$). In this article, we adopt the linear minimum mean square error method to calculate the filter coefficients; the expressions for these optimal filter coefficients are derived in Appendix A.

Compared with the turbo equalizers in [11], [23], [28], and [29], the proposed CE-DFE with IC uses IC filters to remove the CoI directly, whereas in the turbo equalizer, the CoI is eliminated iteratively, which increases the computational complexity. The proposed CE-DFE with IC also has a simpler structure, as shown in Fig. 2. Compared with DA-DFE, the proposed CE-DFE with IC uses the channel estimates directly to calculate the filter coefficients and therefore needs fewer training symbols to achieve convergence, so the bandwidth efficiency is improved. In addition, the proposed CE-DFE with IC needs fewer controlling parameters; thus, it is more robust than DA-DFE.
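For reference, the linear minimum mean square error step mentioned above can be illustrated with the textbook single-stream case. The sketch below is a generic illustration under stated assumptions, not the multi-transmitter expressions of this article: it assumes one transmitter, perfect past decisions, and a channel matrix partitioned into $\mathbf{B}$ (past symbols), $\mathbf{q}$ (current symbol), and $\mathbf{C}$ (future symbols). The symbols $\mathbf{x}^{b}$ and $\mathbf{x}^{c}$ for the past and future symbol groups are notation introduced here for the sketch.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{q}\,x(i) + \mathbf{B}\,\mathbf{x}^{b} + \mathbf{C}\,\mathbf{x}^{c} + \mathbf{u}, \\
\hat{x}(i) &= \mathbf{f}^{H}\mathbf{y} - \mathbf{b}^{H}\mathbf{x}^{b}, \\
\mathbf{b} &= \mathbf{B}^{H}\mathbf{f}
  \quad \text{(so that } \mathbf{b}^{H}\mathbf{x}^{b} \text{ removes } \mathbf{f}^{H}\mathbf{B}\,\mathbf{x}^{b}\text{)}, \\
\mathbf{Y} &= \sigma\,\mathbf{q}\mathbf{q}^{H} + \sigma\,\mathbf{C}\mathbf{C}^{H} + \mathbf{R}, \\
\mathbf{f} &= \arg\min_{\mathbf{f}}
  E\left| x(i) - \mathbf{f}^{H}\left(\mathbf{q}\,x(i) + \mathbf{C}\,\mathbf{x}^{c} + \mathbf{u}\right) \right|^{2}
  = \sigma\,\mathbf{Y}^{-1}\mathbf{q}.
\end{aligned}
```

For a single transmitter, this $\mathbf{Y}$ has the same form as one summand of the matrix $\mathbf{Y}$ given in Appendix A.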

V. EXPERIMENTAL ANALYSIS AND RESULTS
In this section, we present results from numerical simulations, lake experiments, and sea experiments. The performance is compared between our proposed CE-DFE (denoted as CE-DFE with IC) and the traditional CE-DFE (denoted as CE-DFE without IC). The filter coefficients of the two kinds of CE-DFE are computed from the channel estimates, which are obtained by the traditional channel estimation algorithms and by our proposed FRSOMP channel estimation algorithms. In the experiments, we use the following channel estimation algorithms.
3) S-SOMP: Only the spatial joint sparsity is used to estimate the underwater acoustic MIMO channels, with the traditional SOMP algorithm.
4) S-FRSOMP: Our proposed FRSOMP channel estimation algorithm, in which only the spatial joint sparsity is used.
5) ST-SOMP: Both the spatial joint sparsity and the temporal joint sparsity are used to estimate the underwater acoustic MIMO channels, with the traditional SOMP algorithm.
6) ST-FRSOMP: Our proposed FRSOMP channel estimation algorithm, in which both the spatial joint sparsity and the temporal joint sparsity are used.

We use the output SNR, the BER, and constellation plots to evaluate the performance of our proposed FRSOMP channel estimation algorithms and our CE-DFE with IC methods. The output SNR is defined in terms of $x_m$, the transmitted symbol from the mth transmitter, and $x_m^s$, the soft output from the receiver. For description convenience, we denote the mth transmitter and the nth receiver, counted from surface to bottom, as Txm and Rxn, respectively. The channel from the mth transmitter to the nth receiver is denoted as Txm-Rxn. In the simulation experiment, the lake experiment, and the sea experiment, quadrature phase-shift keying (QPSK) mapping is used.
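A common way to compute an output SNR from transmitted symbols and soft outputs is the ratio of symbol power to soft-decision error power, expressed in decibels. The sketch below assumes that form; the exact expression used in this article may differ, and the function name `output_snr_db` is introduced here only for illustration.

```python
import math


def output_snr_db(tx_symbols, soft_outputs):
    """Output SNR in dB, ASSUMING the form 10*log10(sum|x|^2 / sum|x - x_s|^2).

    tx_symbols   : transmitted symbols x_m (complex numbers)
    soft_outputs : equalizer soft outputs x_m^s (complex numbers)
    """
    signal_power = sum(abs(x) ** 2 for x in tx_symbols)
    error_power = sum(abs(x - s) ** 2 for x, s in zip(tx_symbols, soft_outputs))
    return 10.0 * math.log10(signal_power / error_power)
```

Under this assumed definition, a soft output whose error has one hundredth of the symbol power corresponds to an output SNR of 20 dB.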

A. Numerical Simulation
In this section, we use numerical simulation to demonstrate our channel estimation and channel equalization methods. In the simulation experiment, the underwater acoustic MIMO system contains three transmitters and eight receivers, with a range of 2000 m. The sea depth is 20 m; the transmitters are at depths of 5, 10, and 15 m; and the receivers are uniformly spaced at depths from 2 to 16 m. We use the Bellhop model [37] to generate the channels. For the channels from Tx2 and Tx3, we replace the tap coefficients manually but keep the tap delays. The manually generated tap coefficients show exponential decay with decay factors from 0.012 to 0.019, and their phases are randomly distributed. To make sure that the channels from different transmitters to a receiver have three or four common tap delays, one or two taps are moved by hand. We select the ten taps with the largest amplitudes as the channel multipath. Fig. 3 shows the channels for Tx1-Rx6, Tx2-Rx6, and Tx3-Rx6. The multipath delays reach about 35 ms, and the channels are typically sparse. Moreover, some channel taps from different transmitters have common delays, and their estimation performance will be improved by the DCS method.
We investigate our channel estimation and channel equalization methods in terms of output SNR. The SNR for each channel is set to 20 dB. The sparsity for OMP, S-SOMP, S-FRSOMP, ST-SOMP, and ST-FRSOMP is 10; in the ST-SOMP and ST-FRSOMP algorithms, the number of adjacent data blocks is set to Q = 2. The symbol rate for each transmitter is 6000 symbols/s. The channel length is set to 36.7 ms, which corresponds to 220 symbols. The lengths of the feedforward filters, feedback filters, and IC filters are 440 symbols, 219 symbols, and 219 symbols, respectively.
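The lengths quoted above follow from the symbol rate and the channel length. The short sketch below reproduces them, assuming the rule N_f = 2L, N_b = L − 1, and N_r = L − 1 that is stated for the lake experiment and that is consistent with the simulation values. The function names are hypothetical and used only for this illustration.

```python
def receiver_lengths(symbol_rate, channel_length_s):
    """Return the channel length L in symbols and the three filter lengths.

    Assumes N_f = 2L, N_b = L - 1, and N_r = L - 1.
    """
    L = round(symbol_rate * channel_length_s)
    return {"L": L, "N_f": 2 * L, "N_b": L - 1, "N_r": L - 1}


def to_symbols(symbol_rate, duration_s):
    """Convert a duration in seconds to the nearest whole number of symbols."""
    return round(symbol_rate * duration_s)


# Simulation settings: 6000 symbols/s and a 36.7-ms channel length.
print(receiver_lengths(6000, 36.7e-3))  # {'L': 220, 'N_f': 440, 'N_b': 219, 'N_r': 219}
# An observation length of 91.7 ms expressed in symbols.
print(to_symbols(6000, 91.7e-3))        # 550
```

The same conversion applied to the sea trial settings (3200 symbols/s, 62.5 ms) gives the 200-symbol channel length reported there.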
In the simulation, CE-DFE with IC and CE-DFE without IC work in decision-directed mode. The length of the signal frame is 10 s. At the beginning of the signal frame, a preamble that is known to the receiver is used to obtain the initial channel estimates. Because the channels in the simulation are time invariant, the channel estimates are not updated periodically. Fig. 4 shows the output SNR obtained by CE-DFE without IC. From the subfigures, we can see that, when the observation length is less than 91.7 ms, the output SNR increases with the observation length, because a longer observation length improves the channel estimation performance. We can also observe that our ST-FRSOMP achieves the highest output SNR in all three subfigures, because it not only enhances the estimates of taps that have common delays, but also reconstructs the taps that have different delays. Because of the strong CoI, the output SNR obtained by CE-DFE without IC converges at 5 dB. Fig. 5 shows the output SNR obtained by CE-DFE with IC. Unlike in Fig. 4, the output SNR increases with the observation length over the whole range from 18.3 to 110 ms. Because CE-DFE with IC mitigates not only the second and last terms in (24), but also the third and fourth terms in (24), which are denoted as CoI, it obtains a much higher output SNR than CE-DFE without IC. It can be observed that our ST-FRSOMP channel estimation combined with our CE-DFE with IC achieves the highest output SNR, because both channel estimation and channel equalization are able to suppress the CoI.

B. Lake Experiment
The lake experiment was conducted at Black Warrior River, in the vicinity of The University of Alabama, Tuscaloosa, AL, USA. The depth of the lake is around 10 m. We tested an underwater acoustic MIMO system of two transmitters and six receivers. The transmitting array and the receiving array were deployed from a harbor dock, as shown in Fig. 6(a). The six-element receiving array covered depths from 2 to 7 m with an element spacing of 1 m. The transmitter elements were deployed at 2.5 and 3.5 m, respectively. The distance between the transmitting array and the receiving array was roughly 110 m. Fig. 6(b) shows the sound-speed profile. We observe a constant sound speed of roughly 1484 m/s across the depth of the lake.
For transmissions, a data frame of 6 s at a rate of 4250 symbols/s and a carrier frequency of 85 kHz was used. The average SNR among the six receivers was measured to be 35.6 dB. Examples of the channel impulse responses are given in Fig. 7(a) and (b) for Tx1-Rx3 and Tx2-Rx3, respectively. The channel length is set to 33.33 ms, the channel estimation algorithm is LSQR, and the observation length (100 ms) is three times the channel length. From the two subfigures, we observe a delay spread of roughly 10 ms, so we can adopt CS channel estimation to improve the underwater acoustic channel estimation. We can also observe that the multipath structures in the two subfigures contain many taps with common delays as well as taps with different delays, which motivates the use of our FRSOMP algorithm.
The parameters of the receiver are listed in Table II. The oversampling factor $K_{os}$ is set to 1, and the length of the channel impulse response $T_{ch}$ is set to 33.33 ms, which corresponds to L = 200 symbols. The lengths of the feedforward filters, feedback filters, and IC filters are set to $N_f = 2L$ symbols, $N_b = L - 1$ symbols, and $N_r = L - 1$ symbols, respectively. In the OMP, S-SOMP, S-FRSOMP, ST-SOMP, and ST-FRSOMP algorithms, the sparsity is set to 30. In the ST-SOMP and ST-FRSOMP algorithms, the number of adjacent data blocks is set to Q = 2.
In the training mode, the channel estimators assume perfect knowledge of the transmitted symbols. Fig. 8 shows the output SNR in the training mode for both Tx1 and Tx2. As expected, the output SNR increases with the observation length. The LSQR channel estimation algorithm gives the worst performance, because it does not exploit channel sparsity. Comparing the OMP, S-SOMP, S-FRSOMP, ST-SOMP, and ST-FRSOMP channel estimation algorithms, we conclude that our proposed S-FRSOMP and ST-FRSOMP algorithms achieve higher output SNR than the S-SOMP and ST-SOMP algorithms, respectively, because the taps with common delays are enhanced, so that they suffer from less CoI, and the taps with different delays are correctly reconstructed by our proposed FRSOMP algorithm. For example, in Fig. 8(a), at an observation length of 28.23 ms, the S-FRSOMP and S-SOMP algorithms achieve 4.79 and 4.49 dB, respectively, and the ST-FRSOMP and ST-SOMP algorithms achieve 5.68 and 5.30 dB, respectively. In Fig. 8(b), the corresponding results are 4.38 and 4.06 dB, and 5.29 and 4.86 dB. In general, the DCS algorithms achieve higher output SNR than OMP in Fig. 8, because the DCS channel estimators suppress the CoI by combining multiple data blocks to estimate taps with common delays.
Comparing Fig. 8(a) and (c), it can be observed that, in Fig. 8(c), the output SNR has not converged at long observation lengths and still improves slightly with the observation length, whereas in Fig. 8(a), the output SNR has converged. The gain obtained by the proposed ST-FRSOMP and S-FRSOMP over OMP is higher in Fig. 8(c) than in Fig. 8(a). For example, in Fig. 8(c), the gain obtained by the proposed ST-FRSOMP and S-FRSOMP estimators over the OMP estimator is 1.63 and 0.98 dB, respectively, at 37.65 ms, whereas in Fig. 8(a), it is 0.87 and 0.51 dB. It can also be observed that all the sparse channel estimators obtain higher output SNR in Fig. 8(c) than in Fig. 8(a) when the observation length is long. The reason is that, in Fig. 8(c) and (d), our proposed CE-DFE with IC is able to mitigate the CoI. Our proposed ST-FRSOMP channel estimation algorithm combined with our proposed CE-DFE with IC achieves the best output SNR in Fig. 8(c) and (d), because the CoI is suppressed in both channel estimation and channel equalization. Under short observation lengths (less than 40 ms), however, all the sparse channel estimators obtain lower output SNR in Fig. 8(c) than in Fig. 8(a), because a short observation length yields poor channel estimates, which lead to inaccurate equalizer coefficients ($\mathbf{f}_m$, $\mathbf{b}_m$, and $\mathbf{r}_{m,k}$). The same results are observed in Fig. 8(b) and (d).
In the decision-directed mode, the previously detected symbols are used in both channel estimation and Doppler estimation. Because the Doppler scale is small, we use narrowband Doppler estimation and compensation [38]. A 0.25-s preamble is used to obtain the initial channel impulse response estimates and Doppler estimates. Channel estimates obtained by the five channel estimators (ST-FRSOMP, ST-SOMP, S-FRSOMP, S-SOMP, and OMP) are used to compute the filter coefficients of CE-DFE without IC and our CE-DFE with IC. The parameters are the same as those in the training mode, listed in Table II. The observation length is set to 47.1 ms. Periodic training sequences are used to prevent error propagation: the periodic training symbols, which account for 25% of the transmitted symbols, are used to adjust the channel estimates and the Doppler estimates.
The ST-FRSOMP, ST-SOMP, S-FRSOMP, S-SOMP, and OMP estimators achieve output SNRs of 5.21, 4.98, 4.72, 4.42, and 4.24 dB in Fig. 9(a); 4.48, 4.28, 4.07, 3.82, and 3.57 dB in Fig. 9(b); 6.67, 6.20, 5.85, 5.40, and 5.04 dB in Fig. 9(c); and 6.33, 5.84, 5.44, 4.97, and 4.62 dB in Fig. 9(d), respectively. Our proposed ST-FRSOMP channel estimator combined with our proposed CE-DFE with IC achieves the highest output SNR in both Fig. 9(c) and (d). The comparison results in the decision-directed mode are consistent with those in the training mode. It can be observed that, in Fig. 9(a) and (b), the output SNR oscillates because of the error propagation caused by strong CoI.

In Table III, the BER after LDPC decoding does not show much improvement or is even worse. In Table IV, by contrast, the BER is reduced significantly after LDPC channel coding. For example, the raw BER obtained by ST-FRSOMP is 5.33% and 6.47% for Tx1 and Tx2; after channel coding, the BER is 0% and 0.81%, respectively. The results in Table IV show that our proposed ST-FRSOMP channel estimation algorithm combined with our proposed CE-DFE with IC achieves a significantly lower BER.

C. Sea Experiment
To further verify the performance of the proposed channel estimation and channel equalization methods, we conducted a sea trial in Wuyuanbay, Xiamen, China. The depth at the test site was about 10 m. A 2 × 8 MIMO system was used; two transmitters were suspended at depths of 4 and 6 m, respectively, and an eight-element vertical array covered the depth range of 1-8 m with a spacing of 1 m. The range was roughly 1000 m. QPSK symbols were transmitted for a duration of 8 s at a symbol rate of 3200 symbols/s and a carrier frequency of 15 500 Hz. The average received SNR was 33.5 dB. The sparsity S was set to 20, and the number of adjacent data blocks Q was set to 2. The parameters for the MIMO system are shown in Table V. Fig. 10 shows examples of the sea trial channel estimates. The channel length was set to 62.5 ms, corresponding to 200 symbols. To obtain more exact channel estimates, the observation length (187.5 ms) was set to three times the channel length. The channel estimation algorithm is LSQR. From the two subfigures, it is obvious that the multipath structures are sparse. Moreover, it is evident that more than two multipath taps have common delays. Meanwhile, multipath taps that have different delays can be easily found. We use the forward-reverse strategy to improve the estimation of the taps that have different delays.

Fig. 11 shows the output SNR versus observation length in the training mode. In the sea experimental results, our ST-FRSOMP estimator achieved the best output SNR. For example, the output SNR obtained by ST-FRSOMP, ST-SOMP, S-FRSOMP, S-SOMP, and OMP is 4.78, 4.61, 4.38, 4.23, and 4.06 dB in Fig. 11(a), and 6.32, 5.96, 5.59, 5.31, and 4.88 dB in Fig. 11(c). In Fig. 11, our proposed ST-FRSOMP channel estimation algorithm combined with our proposed CE-DFE with IC achieves the best output SNR. Fig. 12 shows the output SNR in the decision-directed mode. The training length is 93.8 ms.
To prevent error propagation, training symbols, which accounted for 20% of the transmitted symbols, were used to adjust the channel estimates and the Doppler estimates. The average output SNR obtained by ST-FRSOMP, ST-SOMP, S-FRSOMP, S-SOMP, and OMP is 4.16, 4.00, 4.05, 3.95, and 3.82 dB in Fig. 12(a); 6.37, 6.30, 6.26, 6.19, and 6.16 dB in Fig. 12(b); 7.24, 7.08, 7.04, 6.93, and 6.65 dB in Fig. 12(c); and 8.50, 8.34, 8.28, 8.13, and 7.93 dB in Fig. 12(d), respectively. Our proposed ST-FRSOMP algorithm combined with our proposed CE-DFE with IC still achieves the highest output SNR. Note that, in Fig. 12(b) and (d), the output SNR varies with time because of the Doppler variation. Finally, some constellations obtained from the sea trial are provided in Fig. 13. Our proposed ST-FRSOMP channel estimation algorithm combined with our proposed CE-DFE with IC obtains the most separated constellations in Fig. 13(b) and (d), compared with the traditional channel estimation algorithm combined with the traditional CE-DFE without IC.

VI. CONCLUSION
For underwater acoustic MIMO communications, the channels from different transmitters to a receiver, and the channels from adjacent data blocks, may exhibit strong correlation. The DCS method is able to use such correlation to improve channel estimation performance. In this article, we used the SOMP algorithm to enhance channel estimates that had common delays under the DCS framework. In this way, the CoI was suppressed. To address the taps with different delays, we introduced the forward-reverse method to SOMP, which was referred to as the FRSOMP algorithm. FRSOMP performed 1) the SOMP algorithm to get initial channel estimates; 2) the forward-add process to add promising candidates into the support sets based on the distributed compressed sensing method; and 3) the reverse-fetch process to check whether the candidates in the support sets should be retained or removed. Thus, it was possible to remove the incorrect candidates. We also proposed a CE-DFE structure that contained the IC component for MIMO equalization. We derived coefficients for the feedforward filters, feedback filters, and IC filters based on the DCS channel estimates.
The numerical simulation, lake experiment, and sea experiment results showed that the proposed methods (our ST-FRSOMP combined with our CE-DFE with IC) obtained higher output SNR, lower BER, and more separated constellations compared with the traditional OMP algorithm combined with the traditional CE-DFE without IC component. It should be noted that the better performance achieved by the proposed methods comes at the cost of higher computational complexity.

APPENDIX A DERIVATION OF FILTER COEFFICIENTS FOR CE-DFE WITH IC
We assume that the channel estimates and the past decoded symbols are accurate. To obtain the solution for the equalization coefficients, we take partial derivatives with respect to the filter coefficients, from which the optimal coefficients are obtained. We assume that $\mathbf{x}_m$ has a variance of $\sigma\mathbf{I}$ and is independent of the noise and the channel estimates. Then, substituting (23b) into (31), we obtain the filter expressions, where

$\mathbf{Y} = E(\mathbf{y}_n\mathbf{y}_n^{H}) = \sum_{k=1}^{M}\left(\sigma\mathbf{q}_k\mathbf{q}_k^{H} + \sigma\mathbf{C}_k\mathbf{C}_k^{H} + \mathbf{R}_k\right)$.