AN EFFICIENT VITERBI DECODER

Convolutional encoding with Viterbi decoding is a good forward error correction technique suitable for channels affected by noise degradation. Fangled Viterbi decoders are variants of the Viterbi decoder (VD) that decode faster and take less memory, but have no error detection capability. The Modified fangled decoder takes this a step further by gaining one bit error correction and detection capability at the cost of doubling the computational complexity and processing time. A new efficient fangled Viterbi algorithm is proposed in this paper with less complexity and processing time along with 2 bit error correction capability. For 1 bit error correction on 14 bit input data, computational complexity is reduced by 36-43% and processing delay is halved compared with the Modified fangled Viterbi decoder. For 2 bit error correction, computational complexity is reduced by 22-36% compared with the Modified fangled decoder.


INTRODUCTION
The Viterbi decoding algorithm was developed by Andrew J. Viterbi, an Italian-American electrical engineer and businessman who co-founded Qualcomm Inc. His seminal paper, "Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm", was published in IEEE Transactions on Information Theory in April 1967 [1]. Since then, other researchers have expanded on his work by finding good convolutional codes, exploring the performance limits of the technique, and varying decoder design parameters to optimize the implementation of the technique in hardware and software.
In "Design and Implementation of Viterbi Decoder with FPGAs", M. Kivoja et al. [6] analyzed the suitability of FPGA device architectures for implementing complex algorithms, choosing the Viterbi algorithm as a deeper case study. Different architectural strategies for implementation are discussed and analyzed, with special emphasis on practical FPGA implementations. Speed performance, easy routability and minimization of inter-chip communication are used as design criteria. A Viterbi decoder of constraint length seven was designed and simulated with VHDL in Synopsys and Mentor tool environments and further implemented on four Xilinx 4028EX devices using a trace-back based architecture. Partitioning aspects of the decoding algorithm are also presented and analyzed.
"A Fast Maximum-Likelihood Decoder for Convolutional Codes" by Jon Feldman et al. [9] describes the Lazy Viterbi decoder, a maximum-likelihood decoder for block and stream convolutional codes. For many codes of practical interest, under reasonable noise conditions, the lazy decoder is much faster than the original Viterbi decoder. A novel design and implementation of an online reconfigurable Viterbi decoder, based on an area-efficient ACS architecture, is proposed in [12]; its constraint length and trace back depth can be dynamically reconfigured, and the architecture offers various throughput and energy trade-offs. Considering the throughput/energy performance metric, experimental results [15] show that the design achieves improvements of up to 26.1% compared with previous designs.
In [17], a novel systolic array-based architecture with time multiplexing and arithmetic pipelining is used to implement an adaptive Viterbi algorithm. That algorithm can reduce the average number of ACS computations by up to 70% compared with the non-adaptive Viterbi algorithm, without degradation in error performance. Further, the total power consumption of its implementation can be reduced by up to 43% compared with an implementation of the non-adaptive Viterbi algorithm, with a negligible increase in hardware. "FPGA implementation of Viterbi decoder" by Hema S et al. [21] describes a field-programmable gate array implementation of a Viterbi decoder with a constraint length of 11 and a code rate of 1/3. It shows that a larger constraint length gives better coding performance.
In this paper an Efficient fangled Viterbi decoder is discussed which decreases computational complexity and also enhances error correction capability. The algorithm is less complex and takes less memory than the existing Viterbi decoders; it needs more than the Fangled Viterbi Decoder, but gains two bit error correction capability in return.

VITERBI DECODER
The basic units of a Viterbi decoder are the branch metric unit, the add-compare-select unit and the survivor management unit. Figure 1 shows the general structure of a Viterbi decoder. It consists of three blocks: the branch metric unit (BMU), which computes the branch metrics; the add-compare-select unit (ACSU), which selects the survivor path for each trellis state and also finds the minimum path metric of the survivor paths; and the survivor management unit (SMU), which is responsible for selecting the output based on the minimum path metric. The Viterbi algorithm is called an optimum algorithm since it minimizes the probability of error. It can be explained briefly with the following three steps.
1. Weigh the trellis; that is, calculate the branch metrics.
2. Recursively compute the shortest paths to time n in terms of the shortest paths to time n-1. In this step, decisions are used to recursively update the survivor path of the signal. This is known as add-compare-select (ACS) recursion.
3. Recursively find the shortest path leading to each trellis state using the decisions from Step 2. The shortest path is called the survivor path for that state and the process is referred to as survivor path decode.
Finally, if all survivor paths are traced back in time, they merge into a unique path, which is the most likely signal path.
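The three steps can be illustrated with a minimal Python sketch of a hard-decision Viterbi decoder. The paper does not state the generator polynomials; the sketch assumes the common (7, 5) octal pair for a constraint length 3, rate 1/2 code, which is consistent with the worked example used in this section. The function names are illustrative only.

```python
from itertools import product

# Assumption: generators (7, 5) octal, K = 3, rate 1/2 (not stated in the paper).
def branch(state, bit):
    """Return (next_state, output_pair) for state (s1, s2) and one input bit."""
    s1, s2 = state
    return (bit, s1), (bit ^ s1 ^ s2, bit ^ s2)

def hamming(a, b):
    """Number of positions in which two bit pairs differ."""
    return sum(x != y for x, y in zip(a, b))

def viterbi_decode(symbols):
    states = list(product((0, 1), repeat=2))
    inf = float("inf")
    # The encoder is assumed to start in state 00.
    metric = {s: (0 if s == (0, 0) else inf) for s in states}
    history = []  # one dict per stage: next_state -> (previous_state, input_bit)
    for rx in symbols:
        new_metric = {s: inf for s in states}
        decision = {}
        for s in states:
            if metric[s] == inf:
                continue  # state not reachable yet
            for bit in (0, 1):
                nxt, out = branch(s, bit)
                cand = metric[s] + hamming(out, rx)   # step 1 + add
                if cand < new_metric[nxt]:            # compare
                    new_metric[nxt] = cand            # select
                    decision[nxt] = (s, bit)
        metric = new_metric
        history.append(decision)
    # Step 3: trace back from the state with the smallest final path metric.
    state = min(states, key=lambda s: metric[s])
    bits = []
    for decision in reversed(history):
        state, bit = decision[state]
        bits.append(bit)
    return bits[::-1]

received = [(1, 1), (1, 0), (0, 0), (0, 1), (0, 1), (1, 1)]
print(viterbi_decode(received))  # [1, 0, 1, 1, 0, 0]
```

The dictionary-based bookkeeping is chosen for readability; a hardware ACSU would hold the path metrics in registers and the decisions in a survivor memory.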
The following example describes the process of conventional decoding. Consider the received sequence 11 10 00 01 01 11, which is error free and is to be decoded. The step by step procedure for decoding convolutionally encoded data is given below. As soon as bits are received the following steps are repeated.
A pair of bits is given to the branch metric block, which calculates the possible error metrics at that particular state. Two measures can be used to find the error: Hamming distance and Euclidean distance. The Hamming distance is the number of bit positions in which the received pair differs from a possible branch output; the Euclidean distance is the point distance between the possible branch output and the received data, obtained using the point distance formula. Hamming distance is selected here as it is easy to implement in hardware. From stage three of the trellis diagram onwards, any state can be reached from two possible previous states, so two error metrics are obtained. The add-compare-select unit computes both path metrics, compares them, and chooses the smaller as the new path metric. If both path metrics are equal then either one is chosen and stored in the path metric matrix.
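The two branch metrics mentioned above can be compared with a short sketch. The function names and the soft sample values are hypothetical and only illustrate the difference between hard and soft decisions.

```python
import math

def hamming_distance(received, expected):
    """Hard-decision metric: count of differing bit positions."""
    return sum(r != e for r, e in zip(received, expected))

def euclidean_distance(received, expected):
    """Soft-decision metric: point distance between the two symbols."""
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(received, expected)))

# Hard decision: received pair 10 against branch output 11 differs in one bit.
print(hamming_distance((1, 0), (1, 1)))  # 1

# Soft decision: hypothetical unquantized samples against the same branch output.
print(euclidean_distance((0.9, 0.2), (1.0, 1.0)))
```

The Hamming metric needs only XOR gates and a small adder, which is why it suits a hardware implementation.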
The above two steps are repeated until the trellis ends and the complete path metric and next state matrices are obtained. Using these matrices, the survivor unit traces the optimum path back from the last values of the next state matrix, and the data is then decoded. Figure 2 shows the error free decoding of convolutional codes. Given the message data 101100, the encoded output is 11 10 00 01 01 11, which is received error free at the receiver. After completing the first two steps explained above, the path metric to reach each state in the trellis is obtained; it is shown in red just above the states. After calculation of the path metrics, the survivor unit traces back the optimum path, which always starts from state zero, as shown in figure 2.
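The encoded sequence in this example can be reproduced with a small encoder sketch. The generator polynomials (7, 5) octal are an assumption, not stated in the paper, but they yield exactly the sequence quoted above for the message 101100.

```python
def conv_encode(message):
    """Rate 1/2, constraint length 3 encoder; generators (7, 5) octal are assumed."""
    s1 = s2 = 0                      # shift register starts cleared
    encoded = []
    for bit in message:
        encoded.append((bit ^ s1 ^ s2, bit ^ s2))
        s1, s2 = bit, s1             # shift the new bit in
    return encoded

pairs = conv_encode([1, 0, 1, 1, 0, 0])
print(" ".join(f"{a}{b}" for a, b in pairs))  # 11 10 00 01 01 11
```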

FANGLED VITERBI DECODER
Conventional Viterbi decoders involve redundancies in path calculation, extra path and branch metric computation, added processing delay and a higher memory requirement. These parameters depend closely on the constraint length K of the decoder: the higher the constraint length, the more robust the code, but the number of computations, the processing delay and the memory requirement increase exponentially, and so do the device utilization and the area required.
In the fangled Viterbi decoder the redundancies are reduced, hence it takes less processing time and less memory, but it has no error correction capability. The working principle of the fangled decoding technique can be explained with the following example. Let the transmitted data be 11 10 00 01 01 11 and, due to channel impairments, let the received data be 10 10 00 01 01 11; that is, the received data is not the same as the transmitted data and the error is in the second bit. The trellis decoding for the first two bits is shown in figure 3. Initially the first two bits are given to the branch metric unit, which replicates the encoder and compares them with the outputs for the possible next states; at the beginning the next state can be either 00₂ or 10₂. The branch metric for both possible states is 1, as shown in figure 3, so the ACSU has to choose between two states with the same branch metric, and a conflict arises at this stage.
The ACSU randomly selects one of the possible states, so the conflict is not properly resolved in the fangled scheme, resulting in erroneous decoding of the data.
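The conflict can be made concrete with a sketch of one possible reading of the fangled scheme: a single-path decoder that follows the smaller branch metric at every symbol. The generators (7, 5) octal and the `tie_bit` parameter are assumptions; `tie_bit` stands in for the ACSU's random choice so that the run is reproducible.

```python
def branch(state, bit):
    """Return (next_state, output_pair); generators (7, 5) octal are assumed."""
    s1, s2 = state
    return (bit, s1), (bit ^ s1 ^ s2, bit ^ s2)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def fangled_decode(symbols, tie_bit=0):
    """Follow one path only; on equal branch metrics take `tie_bit` (stand-in for a random pick)."""
    state, bits, conflicts = (0, 0), [], []
    for index, rx in enumerate(symbols, start=1):
        (next0, out0), (next1, out1) = branch(state, 0), branch(state, 1)
        m0, m1 = hamming(out0, rx), hamming(out1, rx)
        if m0 == m1:
            conflicts.append(index)      # the ACSU cannot tell the branches apart
            bit = tie_bit
        else:
            bit = 0 if m0 < m1 else 1
        bits.append(bit)
        state = next0 if bit == 0 else next1
    return bits, conflicts

received = [(1, 0), (1, 0), (0, 0), (0, 1), (0, 1), (1, 1)]   # error in the second bit
print(fangled_decode(received, tie_bit=1))  # lucky pick: message recovered
print(fangled_decode(received, tie_bit=0))  # unlucky pick: decoding goes wrong
```

With the unlucky pick the decoder leaves the correct path at the first symbol and never rejoins it, which is the failure mode described above.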
The fangled scheme works well for error free data. Its disadvantage is that when data is received with an error at a particular stage, a conflict arises and the ACSU has to decide which path to choose, since both error metrics are equal if one bit is received in error. The ACSU then chooses a path at random, which can lead to incorrect decoding of the bits. In [15], an interleaver is combined with the fangled scheme to get good results, but this does not solve the above problem: an interleaver shuffles the bits in the frames, which converts burst errors to random errors, but the errors are still present. Thus decoding of erroneous data is the main disadvantage of the fangled scheme. It can be overcome by a slight modification of the algorithm, called the Modified fangled scheme, which is explained in the next section. Matlab results for the convolutional encoder and the fangled Viterbi decoder with erroneous data are shown in figure 4. The input given to the encoder is [1 0 1 1 0 0] and the theoretical output is [11 10 00 01 01 11 00 00], which is obtained as shown in figure 4. Erroneous data is given to the decoder: the error is introduced at the 3rd bit, giving [11 00 00 01 01 11 00 00]. The required output is [1 0 1 1 0 0], but it is not obtained because the fangled algorithm does not resolve the conflict.

ADVANTAGES AND DISADVANTAGES OF FANGLED VITERBI DECODER:
The advantage of this algorithm is its simplicity and ease of implementation. Computational complexity is low, as only 14 additions and 7 comparisons are needed to decode 14 bit data.
Processing delay is low at 150 ns when simulated on the Xilinx ISE simulator with a clock period of 100 ns and a 50% duty cycle. Device utilization is also very low, at 1019 and 1070 gate counts for no error and 1 bit error respectively. The Fangled Viterbi decoder cannot detect an error; it corrects one only with a probability of about 50%, because the ACSU picks between the two conflicting paths at random. This makes it incapable of working in a noisy channel.

EFFICIENT FANGLED VITERBI DECODER
The name efficient is given in the sense that, compared with the Fangled Viterbi decoder, our design can correct one and two bit errors, while computational complexity and memory requirement are reduced compared with the conventional Viterbi decoder. The modified fangled decoder has the same blocks as the fangled decoder: the encoder engine, BMU, ACSU and SMU work as in the Fangled Viterbi Decoder, with minor changes. From the initial stage, or first symbol, the ACSU computes two paths, which coincide if no error is present, or split and follow two branches in case of a conflict caused by erroneous data. Two survivor paths are thus maintained, and the correct decision is based on the smaller path metric of the two. The trace back method is suitable for this approach.

MODIFIED FANGLED TRELLIS DECODING
The modified algorithm is the same as the fangled algorithm when the received data is error free, but it differs when there is an error in the received data. When an error is present in the received data, the branch metrics of the two possible states are equal, giving rise to a conflict.
Let the transmitted data be [11 10 00 01 01 11] and, due to channel impairments, let the received data be [10 10 00 01 01 11]; that is, the received data is not the same as the transmitted data and the error is in the second bit. The modified algorithm takes two iterations when the conflict occurs.

First iteration:
The first two bits to be decoded are given to the branch metric unit, which calculates the Hamming distance between the outputs for the possible states and the received data. At t=1 the possible states are 00₂ and 10₂. The received bits are 10, as in the example, so the output of the branch metric unit is 1 for both states.
If the two branch metric values are equal, the first iteration of the modified fangled algorithm assumes the lower state as its next state. The ACS of the modified algorithm therefore initially takes state 10₂ as its next state, as shown in figure 5. It then completes the trellis by taking the remaining bits, calculating the path metrics of the two possible states at each stage and keeping the minimum of the two. Finally the path metric of the first iteration is obtained and stored. The final path metric for the first iteration is 1, as in figure 5.

Second iteration:
The first two bits to be decoded are given to the branch metric unit, which calculates the Hamming distance between the outputs for the possible states and the received data. At t=1 the possible states are 00₂ and 10₂. The received bits are 10, as in the example, so the output of the branch metric unit is 1 for both states. If the two branch metric values are equal, the second iteration of the modified fangled algorithm assumes the upper state as its next state. The ACS of the modified algorithm therefore takes state 00₂ as its next state, as shown in figure 6. It then completes the trellis as in the first iteration, taking the remaining bits, calculating the path metrics of the two possible states at each stage and keeping the minimum of the two. At last the path metric of the second iteration is obtained and stored. The final path metric for the second iteration is 3, as shown in figure 6.
After completion of both iterations, the two path metrics are compared and the path with the smaller metric is selected as the decoding path. The first path metric is 1 and the second is 3, so the path of the first iteration is selected. The decoded data for the selected path is [101100], which is the original message data. Thus the modified fangled scheme resolves the conflict and gives the optimum result even if the data is received in error.
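The two-iteration procedure can be sketched as two single-path passes that differ only in how a conflict is resolved, followed by a comparison of the final path metrics. This is one interpretation, not the authors' implementation: the generators (7, 5) octal and the function names are assumptions, and the second pass's metric in this sketch depends on how later ties are broken, so it need not equal the value in figure 6.

```python
def branch(state, bit):
    """Return (next_state, output_pair); generators (7, 5) octal are assumed."""
    s1, s2 = state
    return (bit, s1), (bit ^ s1 ^ s2, bit ^ s2)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def single_pass(symbols, tie_bit):
    """One iteration: follow the smaller branch metric, take `tie_bit` on a conflict."""
    state, total, bits = (0, 0), 0, []
    for rx in symbols:
        options = []
        for bit in (0, 1):
            nxt, out = branch(state, bit)
            options.append((hamming(out, rx), nxt))
        if options[0][0] == options[1][0]:
            bit = tie_bit
        else:
            bit = 0 if options[0][0] < options[1][0] else 1
        total += options[bit][0]          # accumulate the path metric
        state = options[bit][1]
        bits.append(bit)
    return bits, total

def modified_fangled_decode(symbols):
    first = single_pass(symbols, tie_bit=1)    # lower state (10) on conflict
    second = single_pass(symbols, tie_bit=0)   # upper state (00) on conflict
    return min(first, second, key=lambda result: result[1])

received = [(1, 0), (1, 0), (0, 0), (0, 1), (0, 1), (1, 1)]
print(modified_fangled_decode(received))  # ([1, 0, 1, 1, 0, 0], 1)
```

Because every symbol is processed twice, the sketch also makes the doubling of additions and comparisons visible.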

RESULTS OF CONVOLUTION ENCODING AND MODIFIED FANGLED DECODING
The input given to the encoder is [1 0 1 1 0 0 0] and the theoretical output is [11 10 00 01 01 11 00 00], which is obtained as shown in figure 7. The error free data is given to the decoder. The required output is [1 0 1 1 0 0 0], which is obtained and is shown in figure 7.
The input given to the encoder is [1 0 1 1 0 0] and the theoretical output is [11 10 00 01 01 11 00 00], which is obtained as shown in figure 8. Erroneous data is given to the decoder: the error is introduced at the 2nd bit, giving [10 10 00 01 01 11 00 00]. The required output is [1 0 1 1 0 0], which is obtained and is shown in figure 8.

ADVANTAGES AND DISADVANTAGES OF MODIFIED FANGLED VITERBI DECODER:
The main advantage of the Modified Fangled Viterbi decoder over the Fangled Viterbi decoder is its 1 bit error correction capability. For this capability two paths are calculated and compared at the end of the trellis, with the path metric as the decision criterion.
The disadvantage is the number of computations carried out to compute both paths: 28 additions and 14 comparisons, double that of the Fangled Viterbi decoder.
Processing delay is 300 ns for a 100 ns clock period with 50% duty cycle, double that of the Fangled Viterbi decoder. Device utilization on a Xilinx Spartan-3 FPGA is slightly greater than that of the fangled Viterbi decoder, at 1113 and 1200 gate counts for no error and 1 bit error respectively. The decoder cannot be used for more than 1 bit error correction. It cannot be used in a channel with burst errors, though an interleaver and de-interleaver are a solution for this.
The Efficient fangled Viterbi decoder can solve most of the disadvantages mentioned above, except device utilization, which tends to increase slightly.

IMPLEMENTATION OF EFFICIENT FANGLED VITERBI DECODER (ONE / TWO BIT ERROR CORRECTION)
As shown in the block diagram in figure 9, the Efficient Fangled Viterbi Decoder is an extension of the fangled decoder and similar to the modified fangled decoder, but its core is the decision unit. This removes the redundancies of the modified fangled design, such as extra computations and the consequent memory requirement, while adding 2 bit error correction capability. The whole design can again be divided into:
• Encoder engine
• Branch metric calculation unit (BMU)
• Decision unit
• Survivor path calculation unit
The encoder engine, BMU and SMU are the same as explained for the modified fangled Viterbi decoder. The BMU calculates the Hamming distance between received data and reference data to compute the branch metric, and the SMU stores the decision bits from the ACSU.
The decision unit is both the advantage and the bottleneck of this design. As seen above, the modified fangled Viterbi algorithm can choose the best or shortest path in terms of Hamming distance only at the end of the trellis or received data, which leads to extra computations and processing delay. The decision unit avoids this by choosing the correct path using the next received symbol. This is explained further in the subsections below using trellis diagrams. The flow chart in figure 10 shows the efficient fangled Viterbi algorithm. The decision unit is part of the ACSU and works in tandem with it.

EFFICIENT FANGLED TRELLIS DECODING:
The efficient fangled Viterbi decoder behaves like the fangled algorithm when the received data is error free. In case of an error, the branch metrics of the two states are equal, giving rise to a conflict, as for the errors at symbols 2 and 4 in the trellis diagram shown in figure 11. In this case both branches have a branch metric of one, which makes the correct path decision a problem. Since the criterion for the decision is the path metric, the modified fangled Viterbi decoder handles this by calculating both paths from the conflict to the end and then comparing them for correct decoding, which also makes it incapable of decoding a 2 bit error. The decision unit solves this problem by using the very next symbol to calculate the error metrics along each path. By comparing these and selecting the path with the smaller metric, the decision unit gives the decoded output and selects the optimum path in the process. Resolution of the conflict thus becomes possible even for 2 bit errors. In figure 11 the optimum decoded path is shown by a dark line, and rejected branches are marked with an 'X'. The decision unit is used only in case of an error; otherwise the decoder works as a fangled Viterbi decoder and can therefore match its parameters closely. This is explained further in the following subsections.

DECODING 1 BIT ERROR AT SYMBOL 2 USING EFFICIENT FANGLED VITERBI DECODER:
Consider figure 11. Let the transmitted data after encoding be [11 10 00 10 00 10 11], and let channel impairment cause a 1 bit error at the fourth bit, that is, in symbol 2, making the received data [11 11 00 10 00 10 11]. As long as the received data is the same as the transmitted data, the efficient fangled algorithm works as the fangled Viterbi algorithm. This is the case for symbol 1, whose decoded output is '1': branch '11' is chosen as the correct branch since its branch metric is 0, and branch '00', whose branch metric is 2, is discarded. For the second symbol '11', as shown in figure 11, the path metric before the error is 0, and the two branches from state 1 are '10' to state 2 and '01' to state 3, both having a branch metric of 1.
The decision unit now comes into play, taking the very next symbol, symbol 3 ('00'), as an input to calculate the path metric needed for the decision criterion. As seen from figure 11, the next step along state 3 to state 3 via branch '10' increases the metric to 2, which makes any further progress along this path redundant. The efficient fangled Viterbi algorithm can now decide on the correct path, which is along state 2 via branch '10'. The algorithm then resets the decision criterion, i.e. the path metric, to 0, which makes the algorithm usable for 2 bit error correction.
The practical implication of the discussion above is the algorithm's improvement in computational complexity. For decoding every symbol the ACSU performs 2 additions and 1 comparison, which makes it the bottleneck of the whole system. The decision unit, which is used only in case of an error, takes 2 additions and 1 comparison, as it computes the two path error metrics and compares them. When the path has to be changed, as for the error at symbol 2, another 2 additions and 1 comparison are required for decoding the correct branch.

DECODING THE SECOND BIT ERROR AT SYMBOL 4 USING EFFICIENT FANGLED VITERBI DECODER:
Again consider the same figure 12. The transmitted data is the same, and channel noise adds a second bit error at bit position 8, i.e. in symbol 4, making the received data [11 11 00 11 00 10 11].
For the error free symbol 3 the algorithm again works as the fangled Viterbi algorithm, needing only 2 additions and 1 comparison to decode '00', since the path metric was reset to 0 after the error correction at symbol 2. For the fourth symbol '11', the error causes the branch metrics of both branches from state 1, to state 2 and to state 3, to become 1. This causes a conflict, which is resolved by the decision unit. Taking the path metric as the decision criterion, the decision unit takes symbol 5, which is '00', to compute the next path metric for the chosen path; this takes 2 additions and 1 comparison. The decision unit finds that the path metric increases to 2, concludes that the present path is incorrect, and changes the path. For the new path, symbol 4 ('11') has to be decoded again, adding 2 more additions and 1 comparison to the computational complexity. The overall overhead is therefore an extra 4 additions and 2 comparisons to decode the data correctly for each bit error of this kind.
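The decision-unit behaviour walked through in the last two subsections can be sketched as a single-path decoder with a one-symbol look-ahead on conflicts. This is a hedged interpretation: the generators (7, 5) octal, the function names and the fallback to bit 0 on the last symbol (relying on zero padding) are assumptions, and the sketch does not model the operation counts.

```python
def branch(state, bit):
    """Return (next_state, output_pair); generators (7, 5) octal are assumed."""
    s1, s2 = state
    return (bit, s1), (bit ^ s1 ^ s2, bit ^ s2)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def best_next_metric(state, rx):
    """Smallest branch metric reachable from `state` for the next received symbol."""
    return min(hamming(branch(state, bit)[1], rx) for bit in (0, 1))

def efficient_fangled_decode(symbols):
    state, bits = (0, 0), []
    for i, rx in enumerate(symbols):
        candidates = []
        for bit in (0, 1):
            nxt, out = branch(state, bit)
            candidates.append((hamming(out, rx), bit, nxt))
        if candidates[0][0] != candidates[1][0]:
            # No conflict: behave like the fangled decoder.
            _, bit, state = min(candidates)
        elif i + 1 < len(symbols):
            # Conflict: decision unit looks at the very next symbol.
            _, bit, state = min(
                candidates,
                key=lambda c: best_next_metric(c[2], symbols[i + 1]),
            )
        else:
            # Last symbol: no look-ahead available, assume a padded zero.
            _, bit, state = candidates[0]
        bits.append(bit)
    return bits

one_error = [(1, 1), (1, 1), (0, 0), (1, 0), (0, 0), (1, 0), (1, 1)]
two_errors = [(1, 1), (1, 1), (0, 0), (1, 1), (0, 0), (1, 0), (1, 1)]
print(efficient_fangled_decode(one_error))   # [1, 0, 1, 0, 1, 0, 0]
print(efficient_fangled_decode(two_errors))  # [1, 0, 1, 0, 1, 0, 0]
```

The look-ahead only works when the symbol after the erroneous one is itself error free, which is the dependency the decoder trades for its lower operation count.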

RESULTS OF CONVOLUTION ENCODING AND EFFICIENT FANGLED DECODING
Figure 13 shows the simulation result of the Efficient Fangled Viterbi decoder for a 1 bit error at symbol 2. It is evident that the decision unit was used for the error at the second symbol, or third clock cycle, giving the proper decoded output of [1 0 1 0 1 0 0] at the result port. This takes an extra 4 additions and 2 comparisons, so the decoder takes 18 additions and 9 comparisons to decode 14 bit data. This is not true for errors in certain positions, such as a bit error at symbol 1, 3 or 5, where the decision unit does not need to switch path, as proved in section 6.4 of chapter 6. In that case it takes an extra 2 additions and one comparison, so that the decoder takes only 16 additions and 8 comparisons to decode 14 bit data. Figure 14 shows the simulation result of the Efficient Fangled Viterbi decoder for a 2 bit error at symbols 2 and 4. Here, for the error symbols 2 and 4, the decision unit has given the decoded output at clock cycles 3 and 5 respectively. The decision unit takes 4 additions and 2 comparisons per error symbol for correction, so to decode 14 bit received data with a 2 bit error at these positions, 22 additions and 11 comparisons are required.
Again, computational complexity is lower for errors at symbols 1, 3 and 5. It can thus be stated that, for the Efficient fangled Viterbi decoder, computational complexity depends on the error position.

ADVANTAGE OF EFFICIENT FANGLED VITERBI DECODING ALGORITHM:
The advantage of the Efficient Viterbi decoder is that, compared with the Modified fangled Viterbi decoder, computational complexity is reduced by 36-43% for a one bit error in 14 bit data, and reduced to 64-78% of it (a reduction of 22-36%) for a two bit error in 14 bit data. For 14 bit data, i.e. 7 symbols, the modified fangled Viterbi decoder takes 28 additions and 14 comparisons, while the efficient fangled decoder needs only an extra 2 or 4 additions and 1 or 2 comparisons respectively per error, depending on the error position, as explained earlier.
When no error is present this decoder works as the Fangled Viterbi decoder, taking only 14 additions and 7 comparisons to decode 14 bit input data.
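The quoted percentages can be checked with a few lines of arithmetic. The sketch below treats an addition and a comparison as one operation each, which is an assumption about how the figures were derived; under that assumption the results come close to the ranges stated above.

```python
BASE_ADDS, BASE_CMPS = 14, 7     # fangled decoder, 14 bit data, no error
MODIFIED_OPS = 28 + 14           # modified fangled: additions + comparisons

def efficient_ops(extra_per_error):
    """Total operations given the (additions, comparisons) overhead of each error."""
    adds = BASE_ADDS + sum(a for a, _ in extra_per_error)
    cmps = BASE_CMPS + sum(c for _, c in extra_per_error)
    return adds + cmps

def reduction_pct(ops):
    """Percentage saved relative to the modified fangled decoder."""
    return 100 * (MODIFIED_OPS - ops) / MODIFIED_OPS

cases = {
    "1 error, path switch": [(4, 2)],
    "1 error, no switch": [(2, 1)],
    "2 errors, both switch": [(4, 2), (4, 2)],
    "2 errors, no switch": [(2, 1), (2, 1)],
}
for name, extra in cases.items():
    ops = efficient_ops(extra)
    print(f"{name}: {ops} operations, {reduction_pct(ops):.1f}% fewer")
```

The two-error worst case uses 33 of 42 operations, i.e. roughly 78% of the modified fangled count, which matches the upper end of the 64-78% range.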
The Efficient fangled Viterbi decoder has an error correction capability of 2 bits, and the decision unit is capable of correcting more than that provided the symbol following each erroneous symbol is error free.

SHORTCOMING OF EFFICIENT FANGLED VITERBI DECODING ALGORITHM:
The main drawback of this algorithm is its dependency on the accuracy of the decision unit, which renders it inadequate for data with burst errors. It must be noted that a 2 bit burst error within a single symbol cannot be corrected by Hamming distance (hard decision) decoding. As for errors in consecutive symbols, consider figure 11, where the errors in symbols 2 and 4 are corrected by using symbols 3 and 5 in the decision unit. If symbol 3 or 5 had also been in error, the choice of the correct path would have been impossible: the decision unit would be rendered useless, causing the whole decoding algorithm to fail. This problem arises only for 2 bit error correction. A solution is the use of an interleaver and de-interleaver, but only after testing the channel characteristics; otherwise it may turn normally distributed errors into burst errors.
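The effect of interleaving on a burst can be illustrated with a simple block interleaver sketch. The paper does not specify the interleaver type or size; the row-in, column-out structure and the 3 x 4 dimensions below are assumptions chosen only for illustration.

```python
def interleave(bits, rows, cols):
    """Write row by row, read column by column."""
    if len(bits) != rows * cols:
        raise ValueError("block size must equal rows * cols")
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(bits, rows, cols):
    """Inverse of interleave: restore the original order."""
    if len(bits) != rows * cols:
        raise ValueError("block size must equal rows * cols")
    return [bits[c * rows + r] for r in range(rows) for c in range(cols)]

rows, cols = 3, 4
positions = list(range(rows * cols))        # label each bit by its original index
channel = interleave(positions, rows, cols)

# A burst hits three consecutive channel positions.
burst = channel[4:7]
print(sorted(burst))  # [2, 5, 9]: the errors are no longer adjacent after de-interleaving

assert deinterleave(channel, rows, cols) == positions
```

After de-interleaving, the three burst errors land in separate symbols with error free symbols in between, which is the condition the decision unit needs.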
The dependency on the next symbol makes the algorithm incapable of correcting an error in the last symbol, as there is no symbol in the trellis after it. This problem is handled automatically by padding two zeros in the encoder, so that the last two bits are known to be 0 before they are decoded.

RESULTS AND CONCLUSION
When designing a Viterbi decoder in any of its variants, not every parameter can be improved at once. There is always a tradeoff between them, the extent of which is decided by the performance required from the decoder. Improving some parameters, such as computational complexity, processing time and error correction capability, degrades others, such as device utilization and hardware complexity. The Efficient fangled decoder addresses these problems, as shown in table 1. Compared with the Fangled Viterbi decoder, every parameter degrades except error correction capability, which increases to 2 bits. For 1 bit error correction on 14 bit input data, compared with the Modified fangled Viterbi decoder, computational complexity improved by 36-43% and processing delay was halved, but device utilization increased substantially, by 51%, when implemented on a Spartan-3 FPGA. For 2 bit error correction, compared with the Modified fangled decoder, computational complexity decreased by 22-36%.
For 2 bit error correction the decoder can also be compared with the conventional Viterbi decoder; the comparison in table 1 shows substantial improvements in every parameter. As shown, computational complexity improved by 50-60% for 14 bit received data. The processing time and device utilization of the Viterbi decoder for 2 bit error correction in 14 bit received data are 4.4 and 3 times those of the Efficient fangled Viterbi decoder, respectively.

Figure 1. Block diagram of Viterbi decoder

Figure 2. Trellis diagram for error free decoding

Figure 4. Simulation results of Fangled Viterbi decoder for erroneous data.

Figure 7. Simulation results of modified fangled Viterbi decoder for error free data.

Figure 8. Matlab simulation results of modified fangled Viterbi decoder for erroneous data

Figure 13. Simulation result of Efficient Fangled Viterbi decoder for 1 bit error at symbol 2
Figure 14. Simulation result of Efficient Fangled Viterbi decoder for 2 bit error at symbol 2 & 4.

4.1. IMPLEMENTATION OF MODIFIED FANGLED VITERBI DECODER (ONE BIT ERROR CORRECTION)
The Modified fangled Viterbi decoder deals with the drawback of the fangled Viterbi decoder, namely that it has no reliable error detection and correction capability. The Modified Viterbi algorithm takes the accumulated path metric as the criterion for deciding the correct path for 1 bit error correction. In case of a conflict or error it follows both paths and compares them at the end to choose the optimum path. It therefore fails completely on a noisy channel that can introduce burst errors. It gains one bit error correction capability at the cost of double the computational complexity and processing delay, as shown later in the subsections. This trait makes the Modified fangled Viterbi decoder suitable for low noise channel applications but not for burst errors.

The Fangled Viterbi decoder cannot detect an error but can sometimes correct it, while calculating only one path. It can be seen from table 1 that the Fangled Viterbi decoder has the lowest computational complexity and device utilization (only 1019 gate counts) among the designs, together with a processing time of 150 ns, which is matched only by the Efficient fangled Viterbi decoder. The Modified Fangled Viterbi decoder was designed as a solution to the above problems by providing one bit error correction capability, but table 1 clearly shows the tradeoffs it has to make in order to do so. Computational complexity and processing time have doubled, while device utilization remained almost the same, with a small increase of 17%. The Modified Fangled Viterbi decoder cannot correct a 2 bit error.
Table 1. Comparison table for decoding 14 bit input data from a (2, 1, 2) encoder of constraint length 3.