A Novel Architecture of Radix-3 Single-path Delay Feedback (R3SDF) FFT Using MCSLA

ABSTRACT


INTRODUCTION
The discrete Fourier transform (DFT) is central to modern telecommunications and digital signal processing, but computing it directly is computationally intensive. To overcome this problem, Cooley and Tukey developed the fast Fourier transform (FFT), which has proved particularly valuable for applications involving orthogonal frequency division multiplexing (OFDM), such as Worldwide Interoperability for Microwave Access (WiMAX), Long-Term Evolution (LTE), asymmetric digital subscriber line (ADSL), very-high-speed DSL (VDSL), and digital audio/video broadcasting (DAB/DVB) systems.
To reduce power consumption and hardware cost, several types of FFT processor have been developed. The memory-based architecture gives a low-power solution, but it suffers from long latency and may need extra buffer space for system synchronization. The pipelined Single-path Delay Feedback (SDF) FFT architecture was developed to reduce the memory required by memory-based architectures. This approach uses N−1 delay elements, multiplication accounts for less than 50% of the computation, and the control unit design is relatively straightforward. These features are particularly advantageous in high-performance designs for portable digital signal processing devices.
Fixed-radix FFTs such as the radix-3 FFT are considered competitive with the radix-2 FFT. In this paper, a new pipelined radix-3 SDF FFT using a modified carry select adder (MCSLA) is designed to reduce the number of multiplications. The modified carry select adder performs the addition operations in the radix-3 FFT, reducing power consumption and improving the performance of the FFT processor.
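The N−1 figure can be checked with a short count. The sketch below assumes, as is usual for SDF pipelines but not stated in this paper, that stage s of a radix-r pipeline with N = r^n points uses (r−1) feedback delay lines of length N/r^s each; the symbols r, s and n are introduced here only for illustration.

```latex
\sum_{s=1}^{n} (r-1)\,\frac{N}{r^{s}}
  = (r-1)\,N \cdot \frac{1 - r^{-n}}{r-1}
  = N - 1, \qquad N = r^{n}.
```

Under this assumption, a 27-point radix-3 pipeline would need 18 + 6 + 2 = 26 delay elements, and an 8-point radix-2 pipeline 4 + 2 + 1 = 7.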

BACKGROUND
A novel algorithm for computing radix-3, radix-6, and radix-12 FFTs is presented in [1]. The FFT is evaluated in an ordinary (1, j) complex plane; the number of additions is reduced substantially and the number of multiplications is also reduced. An efficient approach to computing the Discrete Fourier Transform (DFT) with a radix-3 Fast Fourier Transform (FFT) algorithm is described in [2]. It requires fewer multiplications than existing methods. The matrix formed by the various powers of the twiddle factor is decomposed into two matrices, and it is shown that fewer complex multiplications are needed to compute the result than with the original Cooley-Tukey algorithm.
A hardware implementation of mixed-radix FFTs with radix-5 and radix-3 cores, in addition to the standard radix-2 core, is presented in [3]. The mixed-radix FFT is more costly than the radix-2 implementation. A 1200-point mixed-radix FFT needs 36 real multipliers in a pipelined implementation, whereas a 2048-point radix-2 FFT needs 30 real multipliers. A radix-3 FFT in which the elementary three-point DFTs need no multiplications is described in [4]. This reduces the number of multiplications at the cost of an increase in the number of additions. The algorithm is therefore advantageous on processors where multiplication takes longer than addition.
A novel FFT algorithm is developed in [5] together with the design of a pipelined architecture. The algorithm in [5] reduces the number of complex multipliers as well as the size of the twiddle factor ROMs, and is shown to be suitable for large FFT VLSI implementations. FFT architectures designed for OFDM applications are presented in [6]-[7]. A novel architecture for an efficient FFT processor is presented in [8] to meet the requirements of high-speed wireless communication standards. That work develops an optimized constant-multiplication arithmetic design that multiplies a fixed-point input by one of several twiddle factor constants.

RADIX-3 FFT ALGORITHM
The radix-3 FFT algorithm computes the Discrete Fourier Transform (DFT) with fewer multiplications than direct evaluation. It is based on the divide-and-conquer method: an N-point DFT is decomposed into successively smaller DFTs. The algorithm applies when the number of data points is a power of 3 (i.e., $3^n$).
The radix-3 algorithm realizes a DFT of length $N = 3^n$ ($n = 1, 2, 3, \ldots$). The DFT of length N is obtained from three DFTs, each of length N/3. For an input signal of length N, direct calculation of the DFT needs $O(N^2)$ complex multiplications; the radix-3 algorithm reduces this count. As a result, the processing time and hardware complexity of the DFT implementation can be reduced.
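The decomposition into three length-N/3 DFTs can be illustrated with a short software model. The following is a minimal sketch of a textbook recursive decimation-in-time radix-3 FFT, not the pipelined hardware design of this paper; the function names `dft` and `fft_radix3` are chosen here only for illustration.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used here only as a reference result."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]


def fft_radix3(x):
    """Recursive decimation-in-time radix-3 FFT (illustrative sketch).

    The input length must be a power of 3.
    """
    n = len(x)
    if n == 1:
        return list(x)
    if n == 0 or n % 3 != 0:
        raise ValueError("length must be a power of 3")

    # Three interleaved subsequences, each of length N/3.
    f0 = fft_radix3(x[0::3])
    f1 = fft_radix3(x[1::3])
    f2 = fft_radix3(x[2::3])

    m = n // 3
    w3 = cmath.exp(-2j * cmath.pi / 3)  # primitive cube root of unity
    out = [0j] * n
    for k in range(m):
        # Twiddle factors W_N^k and W_N^(2k) applied to the sub-DFT outputs.
        t1 = cmath.exp(-2j * cmath.pi * k / n) * f1[k]
        t2 = cmath.exp(-2j * cmath.pi * 2 * k / n) * f2[k]
        # Three-point butterfly combining the sub-DFT outputs.
        out[k] = f0[k] + t1 + t2
        out[k + m] = f0[k] + w3 * t1 + w3 * w3 * t2
        out[k + 2 * m] = f0[k] + w3 * w3 * t1 + w3 * t2
    return out
```

For a 27-point input, `fft_radix3` agrees with `dft` to within floating-point rounding, while the twiddle multiplications are confined to two per butterfly at each of the three recursion levels.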

PROPOSED PIPELINED STRUCTURE OF RADIX-3 SDF FFT USING MCSLA
In this paper, a new pipelined radix-3 SDF FFT structure is designed to improve speed. The radix-3 FFT reduces the number of multiplications. The Single-path Delay Feedback FFT is a pipelined frequency transformation technique in which the inputs are supplied serially, and it provides high-speed operation. However, this FFT structure incurs more delay and power consumption [9] because it stores a large number of unwanted intermediate signals. At the same time, SDF FFT structures have the most efficient memory utilization among pipelined FFT processors. Figure 2 shows the architecture of the radix-3 pipelined SDF FFT. This architecture consists of Processing Elements (PE), delay elements, and twiddle factor values. Addition and subtraction are performed in the processing element. Initially, the real and imaginary parts of the input data are given to the first stage. The input values are then delayed by 1, and the delayed values are given to the processing element.
In the processing element, the addition and subtraction operations are performed. The results are then multiplied by the twiddle factor values. Finally, the output values of the first stage are selected by a multiplexer. The first-stage output is fed to the input of the second stage, and the same operations are repeated in the second and third stages. The main disadvantage of the single-path delay feedback FFT is its large power consumption. To overcome this problem, a modified carry select adder [10] is integrated into the radix-3 SDF FFT to perform addition efficiently and reduce power consumption. The Modified Carry Select Adder is obtained by modifying the full adder structure of the regular CSLA [11] to reduce the number of gates. To add three one-bit inputs, the full adder circuit uses 2 XOR gates, 2 AND gates, and 1 OR gate, giving a gate count of 13. The Reduced Full Adder (RFA) circuit [7] is designed with a minimum number of logic gates: 2 AND gates, 1 OR gate, 2 NOT gates, and 1 multiplexer, giving a gate count of 9. A multiplexer (MUX) based Reduced Full Adder circuit is designed in this paper to improve the performance of digital adder circuits. The structure of the modified CSLA is shown in Figure 3.
The modified CSLA [7] is used to reduce the power consumption and improve the performance of the FFT processor. Compared to the regular CSLA, the modified CSLA [7] gives better performance. Finally, the modified CSLA is integrated into the pipelined radix-3 SDF FFT.
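The gate counts of 13 and 9 are consistent with a common gate-equivalent convention. That convention is assumed here and is not stated in the text: AND, OR and NOT each count as 1 gate, an XOR as 5 gates, and a 2:1 multiplexer as 4 gates. The symbols G below are introduced only for this tally.

```latex
\begin{align*}
G_{\mathrm{FA}}  &= 2\,G_{\mathrm{XOR}} + 2\,G_{\mathrm{AND}} + G_{\mathrm{OR}}
                  = 2(5) + 2(1) + 1 = 13,\\
G_{\mathrm{RFA}} &= 2\,G_{\mathrm{AND}} + G_{\mathrm{OR}} + 2\,G_{\mathrm{NOT}} + G_{\mathrm{MUX}}
                  = 2(1) + 1 + 2(1) + 4 = 9.
\end{align*}
```

Under this assumption, each full adder cell replaced by an RFA saves 4 gate equivalents, or about 31% of the cell.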
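The carry-select principle behind both the regular and the modified CSLA can be illustrated with a bit-level software model. The sketch below is a generic carry select adder, not the RFA-based MCSLA of this paper: each block computes its sum for carry-in 0 and carry-in 1, and a multiplexer picks one result once the real carry arrives. The names `ripple_add` and `carry_select_add`, the 16-bit width, and the 4-bit block size are assumptions made for illustration.

```python
def ripple_add(a_bits, b_bits, carry_in):
    """Ripple-carry add of two little-endian bit lists.

    Returns (sum_bits, carry_out).
    """
    out, c = [], carry_in
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ c)            # full adder sum
        c = (a & b) | (c & (a ^ b))      # full adder carry
    return out, c


def carry_select_add(a, b, width=16, block=4):
    """Generic carry select adder model (illustrative, not the paper's MCSLA).

    Returns (sum modulo 2**width, carry_out).
    """
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    result, carry = [], 0
    for lo in range(0, width, block):
        xa = a_bits[lo:lo + block]
        xb = b_bits[lo:lo + block]
        # Both candidate results are computed without waiting for the carry.
        s0, c0 = ripple_add(xa, xb, 0)
        s1, c1 = ripple_add(xa, xb, 1)
        # Multiplexer: the incoming carry selects one candidate.
        s, carry = (s1, c1) if carry else (s0, c0)
        result.extend(s)
    value = sum(bit << i for i, bit in enumerate(result))
    return value, carry
```

In hardware the two candidate sums of every block are produced in parallel, so the critical path reduces to one block delay plus the chain of multiplexers. The modified CSLA described above keeps this selection scheme and lowers the gate count of the adder cells.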

RESULTS AND DISCUSSION
The Radix-3 Single-path Delay Feedback (R3SDF) FFT using the Modified Carry Select Adder (MCSLA) has been developed in Verilog Hardware Description Language (Verilog HDL). The simulation and synthesis results were obtained using ModelSim 6.3c and the Xilinx 10.1i design tool. The simulation result of the proposed radix-3 pipelined SDF FFT using MCSLA is shown in Figure 4. A comparison of the Radix-3 SDF FFT and the Radix-3 SDF FFT using MCSLA is shown in Figure 5. Table 1 shows that, for the pipelined R3SDF FFT and the R3SDF FFT using the Modified Carry Select Adder respectively, the number of slices is 170 and 100, the number of LUTs is 318 and 180, the delay is 15.173 ns and 10.507 ns, and the power consumption is 475 mW and 118 mW.
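The relative savings implied by the Table 1 figures can be recomputed directly. The short sketch below uses only the values quoted above; the dictionary keys and helper names are chosen here for illustration, and results are truncated (not rounded) to two decimals, which is assumed to be the convention behind the percentages reported in the conclusion.

```python
import math

# Values quoted from Table 1.
baseline = {"slices": 170, "luts": 318, "delay_ns": 15.173, "power_mw": 475}
proposed = {"slices": 100, "luts": 180, "delay_ns": 10.507, "power_mw": 118}


def reduction_percent(before, after):
    """Relative reduction of `after` with respect to `before`, in percent."""
    return (before - after) / before * 100.0


def truncate2(value):
    """Truncate a positive value to two decimal places."""
    return math.floor(value * 100) / 100


reductions = {
    key: truncate2(reduction_percent(baseline[key], proposed[key]))
    for key in baseline
}

for key, value in reductions.items():
    print(f"{key}: {value:.2f}% reduction")
```

With truncation, the computed reductions are 41.17% (slices), 43.39% (LUTs), 30.75% (delay) and 75.15% (power), matching the figures given in the conclusion.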

CONCLUSION
In this paper, a novel design of a pipelined Radix-3 Single-path Delay Feedback (R3SDF) FFT using a modified CSLA has been proposed. The proposed radix-3 FFT requires fewer multiplications than other FFTs and is competitive in speed with the radix-2 FFT. In the pipelined Radix-3 SDF FFT, the area is reduced but the delay and power consumption are increased. To overcome this problem, a modified carry select adder is used to perform the additions in the Radix-3 SDF FFT. The main aim of this paper is to reduce area, latency, and power consumption. The proposed method offers a 41.17% reduction in occupied slices, a 43.39% reduction in LUTs, a 30.75% reduction in delay, and a 75.15% reduction in power consumption compared with the Radix-3 SDF FFT.