Improving The Performance of Viterbi Decoder using Window System

ABSTRACT


INTRODUCTION
The frequency spectrum is a very limited and in-demand resource, and its allocation is strictly regulated by the International Telecommunication Union. That is why one of the most important objectives in the design of a digital communication system is the maximum exploitation of the available spectrum. Traditionally, coding and modulation have been considered as two separate entities [1]. In 1982, the work presented by Ungerboeck [2] showed that a joint optimization of coding and modulation was possible: trellis coded modulation, or TCM, which is based on the same principle as block-coded modulation but is of infinite dimension. The transmitted data form semi-infinite or infinite sequences. The principle stated by Forney applies equally to trellis-coded modulations and block-coded modulations. Nevertheless, trellis-coded modulations do not lead to elements of a lattice of points, since their dimensions are infinite. They are based on the partition of classical constellations (PSK or QAM) into subsets of higher minimum distance. Transitions from one subset to another are governed by a convolutional code. The gain in terms of minimum distance must exceed the loss in terms of the number of useful bits per symbol so that the coded modulations are more efficient than the conventional ones.
In the first part of this work, we present the basic theory of trellis coded modulation. In the second part, we present the algorithms for decoding the TCM encoder: the classical Viterbi decoder, the MAP decoder and its variants log-MAP and max-log-MAP, and our proposed Viterbi decoder improved by the window system. In addition, we present a comparison between soft- and hard-decision Viterbi decoders over an AWGN channel. To close this part, we present the simulations, which were carried out in MATLAB, and the results obtained over the AWGN channel with 2 and 3 memories. In the third part, we propose a function called RSCPOLY2TRELLIS for the recursive systematic convolutional (RSC) encoder, which creates the trellis structure of an RSC encoder from the matrix "H", and we present a simulated comparison of RSC TCM 8PSK with soft Viterbi, MAP, log-MAP, and max-log-MAP decoding. Finally, the last section presents the conclusion of this paper.

RELATED WORKS
In this section, we present some related works that used the decoding algorithms Viterbi, MAP, and log-MAP, as well as the TCM encoder with Ungerboeck mapping and Gray code mapping (UGM).
In [3], Manish Kumar et al. compared the latency performance of the RSC-RSC serial concatenated code, using non-iterative concatenated Viterbi decoding, with that of the RS-RSC serial concatenated code, which uses a concatenation of Viterbi and Berlekamp-Massey decoding. The simulation results showed that latency decreases as the code rate increases, and that RSC-RSC is a better code than RS-RSC because of its lower latency. Hence, the RSC-RSC system is more suitable for low-latency applications.
In [4], Ilesanmi Banjo Oluwafemi improved the performance of two hybrid concatenated Super-Orthogonal Space-Time Trellis Code (SOSTTC) topologies over flat fading channels. The encoding operation is based on the concatenation of convolutional codes, interleaving, and super-orthogonal space-time trellis codes. The decoding of the two schemes is done by applying an iterative decoding process, where a symbol-by-symbol maximum a posteriori (MAP) decoder is used for the inner SOSTTC decoder and a bit-by-bit MAP decoder is used for the outer convolutional decoder.
In [5], the work of Sameer A. Dawood et al. showed the effectiveness of turbo codes in developing a new approach for an OFDM system based on a Discrete Multiwavelet Critical-Sampling Transform (OFDM-DMWCST). The use of turbo coding in an OFDM-DMWCST system helps provide the desired performance at higher data rates. Two types of turbo codes were used in this work: Parallel Concatenated Convolutional Codes (PCCCs) and Serial Concatenated Convolutional Codes (SCCCs). In both types, decoding is performed by the iterative decoding algorithm based on the log-MAP (maximum a posteriori) algorithm.
In [6], Bassou and Djebbari introduced a new type of mapping called Ungerboeck gray mapping trellis coded modulation (TCM-UGM) for spectral efficiencies greater than or equal to 3 bit/s/Hz. This TCM-UGM code outperforms the Ungerboeck TCM code by 0.26 dB over the Gaussian channel and by 2.59 dB over the Rayleigh fading channel at BER = 10^-5. This technique is combined with our approach to obtain more efficiency.
In [7], Trio et al. proposed a VLSI architecture to implement the reversed-trellis TBCC (RT-TBCC) algorithm. This algorithm is designed by modifying the direct-terminating maximum-likelihood (ML) decoding process to achieve a better correction rate, which reduces the computational time and resources compared to the existing solution.

TCM: TRELLIS CODED MODULATION
According to Ungerboeck (1982), whatever the spectral efficiency considered for the transmission, and however complex the code may be, the asymptotic coding gain given by a TCM is almost maximal using a single redundant binary element per transmitted symbol. Thus, for a TCM constructed from a constellation with M = 2^(n+1) points, the spectral efficiency of the transmission is n bit/s/Hz, and the performance of the TCM is compared with that of an uncoded modulation with 2^n states, that is, one having a 2^n-point constellation. The constellation of a TCM therefore has twice as many points as that of the uncoded modulation having the same spectral efficiency.
Suppose, therefore, that we want to transmit a block of n binary elements coming from the information source. It is divided into two blocks of respective lengths ñ and (n - ñ). The block of length ñ is then encoded with a convolutional encoder of rate Rc = ñ/(ñ+1) with v memories (2^v states); the second block is left unchanged. The (ñ + 1) bits from the encoder are then used to select a 2^(n-ñ)-point sub-constellation, while the (n - ñ) uncoded bits are used to select a particular point in this sub-constellation. Figure 1 shows the block diagram of an encoder for Ungerboeck TCM.
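To make the subset-selection principle concrete, here is a minimal Python sketch (not from the paper) for the 8PSK case with n = 2 and ñ = 1: the two coded bits choose one of four 2-point subsets produced by set partitioning, and the single uncoded bit picks the point within the subset. The natural set-partitioning labelling used here is an assumption for illustration.

```python
import cmath

# 8PSK constellation points, indexed by natural (set-partitioning) labels:
# point k sits at angle 2*pi*k/8 on the unit circle.
PSK8 = [cmath.exp(2j * cmath.pi * k / 8) for k in range(8)]

def select_symbol(coded_bits, uncoded_bit):
    """coded_bits: (c1, c0) from the convolutional encoder selects one of
    four 2-point subsets; uncoded_bit selects the point in that subset."""
    subset = (coded_bits[0] << 1) | coded_bits[1]   # 0..3
    index = (uncoded_bit << 2) | subset             # set-partition label 0..7
    return PSK8[index]
```

Note that the two points of a subset (indices k and k + 4) are antipodal, which is exactly the maximal intra-subset distance that set partitioning is designed to produce.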

Rules for Building the Trellis
The implementation of the decoder requires the construction of the trellis of the TCM. To build such a trellis, certain rules must be followed if one wishes to maximize the free distance. For this, Ungerboeck proposes the following three rules:
1. The M = 2^(n+1) signals of the initial (unpartitioned) constellation must be used with the same frequency. Figure 2 shows the set-partitioning method applied to the 8PSK constellation [2].
2. The 2^(n-ñ) parallel branches, if they exist, must be associated with signals belonging to the same 2^(n-ñ)-point sub-constellation.
3. The 2^n branches that leave a state or reach a state must be associated with signals belonging to the same 2^n-point sub-constellation.
The first rule gives the trellis a regular structure. Rules 2 and 3 ensure that the free distance of the TCM is always greater than the minimum Euclidean distance of the uncoded modulation taken as the reference for the coding-gain calculation.
Thus the asymptotic coding gain is:

G = 10 log10[ (d²free / E) / (d²min / E′) ] dB

where dfree is the free Euclidean distance of the TCM, dmin is the minimum distance of the uncoded reference modulation, and E and E′ are their respective average energies.

ALGORITHMS OF DECODING TCM ENCODER
The most common decoding is based on the Viterbi algorithm [8]. It consists of finding in the trellis the path that corresponds to the most probable sequence, that is to say, the one at minimum distance from the received sequence. The following sections illustrate the Viterbi and MAP algorithms.

Algorithm of Viterbi
The aim of maximum-likelihood decoding is to look in the trellis of the code "C" for the path nearest (most likely) to the received sequence (i.e. the observation). The distance employed in the algorithm is either the Euclidean distance, in the case of soft inputs, or the Hamming distance, in the case of hard inputs.
Thus the decoding problem is: given the received symbols {r_i}, one per trellis interval, determine the most likely transmitted path through the trellis. If we assume that the uses of the BSC (binary symmetric channel) are independent (i.e., we have random errors), the problem reduces to minimizing the Hamming distance between the received sequence {r_i} and our estimate {v̂_i} of the transmitted sequence {v_i}:

d = Σ_i dH(r_i, v̂_i)    (1)

The trellis is illustrated in Figure 3. For each trellis transition we do the following:
a) Let M_i(s) be the accumulated state metric, that is, the sum in (1) up to trellis interval i. For each state s, add the branch metric of every incoming transition to the accumulated metric of the state it comes from.
b) Call the transition with the smallest resulting metric the winning transition and store it, together with M_i(s), the state metric of the winning transition.
When these steps are implemented, we are left with a single transition path per state, per trellis interval. The collections of these winning transitions over time are called survivor paths. The decoded path is then the minimum over all survivor paths. In theory, one should wait until the survivor paths merge, that is, until their initial segments coincide. In practice, one stores the results for W trellis intervals and then chooses the best survivor path, that is, the path with the smallest metric M(·).
The Viterbi algorithm therefore requires the computation of 2^kL metrics at each step t, hence a complexity of W × 2^kL, linear in W. However, the complexity remains exponential in k and L, which limits its use to codes of small size (kL of 7 to 10 at most). The width W of the decoding window is taken in practice to be about 5L, which guarantees (empirically) that the survivors converge to a single path inside the decoding window. The Viterbi algorithm therefore requires the storage of the cumulative metrics and of survivors of length 5kL bits [9].
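The steps above can be sketched as a small hard-decision Viterbi decoder. The rate-1/2 code with constraint length 3 and octal generators (7, 5) is an assumption chosen for illustration, not necessarily the paper's encoder; the decoder keeps one survivor per state and returns the path with the smallest final metric.

```python
# Minimal hard-decision Viterbi decoder for a rate-1/2 convolutional code.
# The (7, 5) octal generators and K = 3 are illustrative assumptions.

G = [0b111, 0b101]        # generator polynomials
K = 3                     # constraint length (memories + 1)
NSTATES = 1 << (K - 1)

def encode(bits):
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state                      # shift register
        out += [bin(reg & g).count("1") & 1 for g in G]   # parity outputs
        state = reg >> 1
    return out

def viterbi_decode(received, nbits):
    INF = float("inf")
    metric = [0] + [INF] * (NSTATES - 1)        # encoder starts in state 0
    paths = [[] for _ in range(NSTATES)]
    for t in range(nbits):
        r = received[2 * t: 2 * t + 2]
        new_metric = [INF] * NSTATES
        new_paths = [None] * NSTATES
        for s in range(NSTATES):
            if metric[s] == INF:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | s
                out = [bin(reg & g).count("1") & 1 for g in G]
                ns = reg >> 1
                m = metric[s] + sum(o != x for o, x in zip(out, r))
                if m < new_metric[ns]:          # keep only the survivor
                    new_metric[ns] = m
                    new_paths[ns] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    best = min(range(NSTATES), key=lambda s: metric[s])
    return paths[best]
```

With a noiseless received sequence the decoder recovers the information bits exactly, since the zero-metric path is unique.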

Algorithm of MAP
This algorithm is based on the calculation of the probability of occurrence of a bit (1 or 0) at a given position. We have at our disposal a received string of length T, which comes from the coding of an information word. The method consists of iteratively computing the a posteriori probability of each bit, first as a function of the probabilities of the bits preceding it, and then as a function of the bits following it. For this reason, the algorithm is called the "forward-backward algorithm". Equal importance is given to the "before" bits and the "after" bits.
Here, Y is the string of received bits and t is the position of the bit in the string. Similarly, we denote S_i(l′, l) the set of transitions from state l′ to state l produced by the bit "i" at the input. Let M be the number of possible states.
We try to calculate the log-likelihood ratio λ(u(t)):

λ(u(t)) = log [ P(u(t) = 1 | Y) / P(u(t) = 0 | Y) ]

where u(t) denotes the input of the encoder at time t.
For two given states l′ and l, one defines the joint probability:

σ_t^i(l′, l) = P(s_{t-1} = l′, s_t = l, Y) = P(u(t) = i, s_{t-1} = l′, s_t = l, Y)

where i is the bit that drives the transition from l′ to l (σ_t^i(l′, l) is 0 when there is no transition from l′ to l). We thus have the following relation:

λ(u(t)) = log [ Σ_{(l′,l)} σ_t^1(l′, l) / Σ_{(l′,l)} σ_t^0(l′, l) ]

To calculate σ, we must introduce the forward joint probability:

α_t(l) = P(s_t = l, y(1:t))

where we have denoted y(1:n) the elements from 1 to n of the vector Y. Similarly, we define the backward conditional probability:

β_t(l) = P(y(t+1:T) | s_t = l)

and the branch transition probability:

γ_t^i(l′, l) = P(u(t) = i, s_t = l, y(t) | s_{t-1} = l′)

Using the Bayes rule, and since the observations after time t do not depend on the sequence received up to this moment once the state is known, we obtain:

σ_t^i(l′, l) = α_{t-1}(l′) · γ_t^i(l′, l) · β_t(l)

for bit i.
Computation of α and β: we calculate α recursively. Summing over all the possible transitions from time t-1 and applying the Bayes rule, we find:

α_t(l) = Σ_{l′} Σ_i α_{t-1}(l′) · γ_t^i(l′, l)

and, for the backward recursion:

β_{t-1}(l′) = Σ_l Σ_i β_t(l) · γ_t^i(l′, l)

with γ_t^i(l′, l) = P(u(t) = i) · P(y(t) | x(t)). As the channel noise is white and Gaussian, one can write:

γ_t^i(l′, l) = P(u(t) = i) · (1 / √(2πσ²)) · exp(−‖y(t) − x(t)‖² / (2σ²))

Here x(t) is the value that should have appeared at the output of the encoder for the transition from state l′ to state l.
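The forward and backward recursions can be sketched on a toy trellis. In the sketch below the branch metrics γ are supplied as per-interval dictionaries keyed by (l′, l); any numeric values used with it are made up for illustration and are not taken from the paper.

```python
# Toy forward-backward (BCJR-style) recursion on a small trellis.
# gamma[t] maps a transition (l_prev, l) to its branch probability.

def forward_backward(gamma, nstates, T):
    # alpha[t][l] ~ P(state_t = l, y(1:t)); the encoder starts in state 0
    alpha = [[0.0] * nstates for _ in range(T + 1)]
    alpha[0][0] = 1.0
    for t in range(1, T + 1):
        for (lp, l), g in gamma[t - 1].items():
            alpha[t][l] += alpha[t - 1][lp] * g
    # beta[t][l] ~ P(y(t+1:T) | state_t = l); open (unterminated) trellis
    beta = [[0.0] * nstates for _ in range(T + 1)]
    beta[T] = [1.0] * nstates
    for t in range(T - 1, -1, -1):
        for (lp, l), g in gamma[t].items():
            beta[t][lp] += g * beta[t + 1][l]
    # sigma_t(l', l) = alpha_{t-1}(l') * gamma_t(l', l) * beta_t(l)
    sigma = [{tr: alpha[t][tr[0]] * g * beta[t + 1][tr[1]]
              for tr, g in gamma[t].items()} for t in range(T)]
    return alpha, beta, sigma
```

A useful sanity check is that the sum of σ_t over all transitions equals P(Y) and is therefore the same at every interval t.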

Simplified versions of the MAP algorithm
The BCJR algorithm, or MAP, suffers from one important disadvantage: it must carry out many multiplications. In order to reduce this computational complexity, several simplified versions were introduced: SOVA in 1989 [10], the max-log-MAP algorithm in 1990-1994 [11], [12], and the log-MAP algorithm in 1995 [13]. The multiplication operations are replaced by additions, and three new variables A, B and Γ are defined in the log domain:

A_t(l) = log α_t(l),  B_t(l) = log β_t(l),  Γ_t(l′, l) = log γ_t(l′, l)

For the convolutional encoder used here, we use the symbol-by-symbol MAP algorithm for non-binary trellises. Roughly speaking, we can state that the complexity of the BCJR algorithm is about three times that of the Viterbi algorithm.
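The log-domain substitution rests on the Jacobian logarithm ("max-star"): log(e^a + e^b) = max(a, b) + log(1 + e^-|a-b|). Log-MAP uses this identity exactly, while max-log-MAP drops the correction term and keeps only max(a, b). A minimal sketch:

```python
import math

def max_star(a, b):
    """Exact Jacobian logarithm used by log-MAP:
    log(e^a + e^b) = max(a, b) + log(1 + e^-|a - b|)."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_log(a, b):
    """max-log-MAP approximation: the correction term is dropped."""
    return max(a, b)
```

Since the correction term is always nonnegative and at most log 2, max-log-MAP underestimates the exact value by a bounded amount, which explains its small performance loss relative to log-MAP.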

THE PROPOSED APPROACH

Viterbi Improved by Window System
We saw in the previous sections how it is possible, using an algorithm, to correct an erroneous message. The encoding and decoding methods discussed require a certain computing power, which can cause problems (for example in embedded systems, where computing power is limited and the calculation must be done in real time). Moreover, only messages with a low error percentage can be corrected upon receipt. For this purpose, we propose in our approach to use a window system in the encoding and decoding phases, in order to confine any error to a single window whose length equals the number of memories plus 1.
In this section, we use the window in convolutional encoding and decoding. To explain the difference between the window system and the classical approach, we present the following example: Figure 4 shows the structure of a convolutional encoder with rate R = 1/2. In Table 1 we see that Y is a function of the input X and the values of the states S0 and S1: Yi = F(Xi, S0, S1).
Initialization: S0 = 0; S1 = 0. Figure 5 shows the transmission chain of the convolutional encoder. According to Table 1, we can see that each input bit Xi influences several output bits Yi, over the constraint length of the code (in our case, the constraint length equals three (03)).
This phenomenon is a disadvantage for decoding with the Viterbi algorithm: if all the bits of a Yi are erroneous, the algorithm decodes an erroneous Xi, and this error spreads over the constraint length of the code; consequently, everything that follows is wrong. To overcome this phenomenon, we propose a window whose size equals the constraint length of the code, as indicated in Table 2.
a. The constraint length is 3 (the number of memories + 1). Initialization: S0 = 0; S1 = 0.
b. If all the bits of a Yi are erroneous, the algorithm decodes an erroneous Xi, but the error is confined to a single window.

RSCpoly2trellis for Recursive Systematic Convolutional (RSC) Encoder
The proposed function RSCpoly2trellis has the syntax: trellis = RSCpoly2trellis(H). The RSCpoly2trellis function accepts a polynomial description of a recursive systematic convolutional (RSC) encoder and returns the corresponding trellis structure description. The output of RSCpoly2trellis is suitable as an input to the convenc and vitdec functions and as a mask parameter for the Convolutional Encoder and Viterbi Decoder blocks in the Communications Blockset. Figure 7 shows the recursive systematic convolutional encoder represented by the matrix H.
Figure 8, Figure 9 and Figure 10 show a comparison between the classical Viterbi decoder and the Viterbi decoder with window system for TCM QPSK/8PSK, and for TCM combined with the new Ungerboeck gray mapping (TCM-UGM) described in the Related Works section, with rates of 1/2, 1/3 and 2/3.
a. We can observe, at high signal-to-noise ratios, that the simulation curve using the Viterbi decoder with window system outperforms the classical Viterbi by a gain of 1 dB at BER = 10^-6.
b. To investigate the performance further, the Ungerboeck gray mapping is combined with the TCM encoder. The simulation using TCM-UGM with the windowed Viterbi decoder outperforms TCM with the windowed Viterbi decoder by a gain of approximately 2.7 dB, and outperforms TCM with the classical Viterbi decoder by a gain of approximately 3.8 dB.
The simulations illustrated in Figure 11 show that 4-state TCM-QPSK with a soft-decision Viterbi decoder outperforms 4-state TCM-QPSK with a hard-decision Viterbi decoder: a gain of 4 dB is achieved by using the soft-decision decoder. Figure 12 shows, at high signal-to-noise ratios, that the curve of the log-MAP decoder coincides with that of the max-log-MAP decoder; both outperform the soft Viterbi and MAP algorithms, and a gain of 2 dB at BER = 10^-4 is obtained with log-MAP or max-log-MAP compared to soft Viterbi.
The BER performance is as follows: BER log-MAP = BER max-log-MAP < BER Viterbi soft < BER MAP.
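For illustration, here is a rough Python analogue of what RSCpoly2trellis computes: given feedback and feedforward polynomials, it enumerates the next-state and output tables that make up the trellis structure. The (7, 5) octal polynomial pair (i.e. G(D) = [1, (1 + D²)/(1 + D + D²)]) and the field names are assumptions for the sketch, not the paper's matrix H.

```python
# Sketch of building a trellis structure for a rate-1/2 RSC encoder from
# its feedback (fb) and feedforward (ff) polynomials. Polynomials and
# structure field names are illustrative assumptions.

def rsc_poly2trellis(fb=0b111, ff=0b101, K=3):
    nstates = 1 << (K - 1)
    next_states = [[0, 0] for _ in range(nstates)]
    outputs = [[0, 0] for _ in range(nstates)]
    for s in range(nstates):
        for u in (0, 1):
            # feedback bit: input XOR the state taps of the feedback poly
            fb_bit = u ^ (bin(s & (fb >> 1)).count("1") & 1)
            reg = (fb_bit << (K - 1)) | s
            parity = bin(reg & ff).count("1") & 1
            next_states[s][u] = reg >> 1
            outputs[s][u] = (u << 1) | parity   # systematic bit + parity
    return {"numStates": nstates,
            "nextStates": next_states,
            "outputs": outputs}
```

The resulting tables play the same role as the trellis structure consumed by MATLAB's convenc and vitdec: one next-state and one output symbol per (state, input) pair.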

CONCLUSION
In this paper, MATLAB simulations were used to evaluate the performance of the Viterbi decoder using the window system compared to the classical Viterbi decoder. The simulation results over the AWGN channel with rates 1/2, 1/3 and 2/3 have shown that, at a BER of 10^-6, the Viterbi decoder with window system outperforms the classical Viterbi by 1 dB. Moreover, we propose the use of the Ungerboeck gray mapping to achieve better performance with the TCM encoder: a gain of 2.7 dB was achieved compared to TCM with the windowed Viterbi decoder, and a gain of 3.8 dB is observed compared to the original TCM with the classical Viterbi decoder.
From the above results, it can also be seen that, with rate 2/3, the TCM with recursive systematic convolutional (RSC) encoder and log-MAP or max-log-MAP decoders gives better results than soft Viterbi, with a gain of 2 dB at BER = 10^-4.
It is also clearly shown that soft Viterbi outperforms the MAP algorithm, which is known to be greedy in memory space and computing time; both increase further as the rate and the number of memories increase, so the decoding operation becomes long. For this reason, the MAP algorithm has been simplified into variants such as log-MAP and max-log-MAP, which are used in the new generation of codes such as turbo codes. Researchers place great hope in turbo codes because they approach the limit given by Shannon's second theorem.