Design and Development of IP for Modified Haar Wavelet Transform (MHWT) Image Fusion using FPGA

The fast growth of digital imaging applications in remote sensing, biomedical imaging and other satellite applications has created a need for image fusion architectures capable of storing and processing large amounts of data. The algorithm considered here for FPGA implementation is Modified Haar Wavelet Transform (MHWT) based image fusion, in which four pixels at a time are considered in the calculation of the different bands, as compared to conventional Haar wavelet based image fusion. This modification uses far less memory and computational power. The FPGA implementation of MHWT based image fusion is carried out on a Digilent development board with a Spartan-6 series FPGA, and the architecture is developed in VHDL. Timing analysis is performed and reports are obtained for I/O interactions, memory units, etc. The architecture runs in co-simulation with Simulink. The design is tested with different kinds of images and runs successfully, and a visual analysis of the resultant fused images is carried out.


INTRODUCTION
Modern technological advancements and the progressive needs of human beings have made image processing one of the most popular subjects, in particular image fusion. This is mainly due to the fast growth of digital imaging, which is most used in remote sensing, biomedical and other satellite applications; this has led to a situation where a system must be capable of storing large amounts of data and processing it as fast as possible. The tasks involved in DWT systems are complex and hence need large computational power, which motivates the development of specialized hardware that greatly reduces processing time. The use of the proposed modified algorithm further reduces the time taken by the process. Hence the proposition is made to implement a reconfigurable hardware design for image fusion. The Field Programmable Gate Array (FPGA) supports reconfigurable designs and is considered a suitable platform for image processing applications. Modern advances in technology have given us faster computing chips, and many fields now employ complex image and signal processing, which has paved the way to implement such techniques on chip.
Image fusion is the process by which two or more images are combined into a single image. It normally involves images acquired from the same source or from different sources, i.e. various sensors; based on this criterion, image fusion is divided into several types as given in [1]. The main focus of this paper is the study and analysis of algorithms related to multimodal images and the development of a hardware-implementable design for one of them: image fusion using a modified Haar Discrete Wavelet Transform (DWT). Image fusion involves many techniques based on the parameters considered for fusion, and there are different types of image fusion based on the sources of the images; hence knowledge of the subject and the background of the techniques involved helps in developing the architecture accordingly. Image fusion produces a composite image that may contain enhanced quality, or simply better and clearer information about the object of interest or scene, compared to the individual source images. Image fusion had its evolution beginning with the concept of simple averaging, where pixel intensities are fetched from a set of input images, their respective averages are calculated, and the results are combined in matrix form to produce a fused image. Many advancements have happened in the field since then, employing methods such as Discrete Wavelet Transforms and pyramidal methods to fuse images [1].
There are several algorithms involved in image fusion, as stated in [2]. The primitive algorithms perform the fusion process on the source images using mathematical operations such as averaging, addition and subtraction of the pixel intensities of the images to be fused; these have some serious drawbacks [3].

IMAGE FUSION USING DISCRETE WAVELET TRANSFORM WITH HAAR WAVELET
The Haar wavelet has the lowest complexity and is the simplest wavelet in terms of mathematical computation; its only disadvantage is that it is not continuous and therefore not differentiable [4]. The Haar wavelet Ψ(t) is described in Equation (1):

Ψ(t) = 1 for 0 ≤ t < 1/2, −1 for 1/2 ≤ t < 1, and 0 otherwise. (1)

The scaling function Ø(t) is described in Equation (2):

Ø(t) = 1 for 0 ≤ t < 1, and 0 otherwise. (2)
The FPGA implementation of image fusion presented here follows the modified Haar algorithm of [5]. The only difference is that four nodes are considered instead of the two nodes in the normal Haar DWT: the N/2 detail coefficients are set to zero at each step, rather than the N/2 detail coefficients being computed by the DWT.
Complete wavelet based 2D techniques for fusion are described in [6]. In all wavelet based image fusion techniques, the wavelet transform W is applied to each of the two registered input images I1(x, y) and I2(x, y), and some fusion rule, here denoted Ø, is applied in the transform domain; the fused image is given by Equation (3):

I(x, y) = W⁻¹(Ø(W(I1(x, y)), W(I2(x, y)))) (3)
Consider a signal of length N as given in Equation (5). In MHWT, the average sub-signal obtained at one stage for a signal of length N is given in Equation (6), and the corresponding first detail sub-signal is given in Equation (7). The only difference between the standard Haar wavelet transform and the modified Haar wavelet transform is that in MHWT four nodes are considered instead of two as in the Haar DWT [5]: at each stage of wavelet decomposition the data is reduced twice as fast as in the normal DWT and the computational complexity is reduced by half, which makes it well suited for hardware implementation. The literature review gives an insight into what is missing, i.e. the part of image fusion that can still be addressed; the identified research gaps are as follows.
- Most papers cover newer and better versions of image fusion algorithms judged by quantitative and qualitative analysis; little attention has been given to optimizing the algorithms in terms of computational power.
- Typical algorithms using standard wavelets such as Haar, Daubechies, etc., have a complexity level of N; in contrast, MHWT based image fusion [5] proposes a modified Haar wavelet transform with complexity N/2.
- [5] proposes the MHWT algorithm but does not implement it on hardware; the process is formulated with modified equations and implemented in MATLAB.
- This paper proposes the implementation of the said algorithm on an FPGA, for which the hardware design is developed and implemented.
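The halved workload of MHWT relative to the standard Haar step can be illustrated with a short sketch. This is only an interpretation of the description above: the exact grouping and scaling used in [5] are not reproduced here, so the four-sample grouping and the divisor of 2 in `mhwt_step` are assumptions for illustration.

```python
import math

def haar_step(f):
    # Standard 1-D Haar step: each pair (f[2m], f[2m+1]) yields one
    # average and one detail coefficient (N/2 of each are computed).
    a = [(f[i] + f[i + 1]) / math.sqrt(2) for i in range(0, len(f), 2)]
    d = [(f[i] - f[i + 1]) / math.sqrt(2) for i in range(0, len(f), 2)]
    return a, d

def mhwt_step(f):
    # MHWT sketch (assumed form): four samples are taken per iteration
    # and the N/2 detail coefficients are set to zero instead of being
    # computed, halving the arithmetic work at each decomposition stage.
    a = []
    for i in range(0, len(f), 4):
        a.append((f[i] + f[i + 1]) / 2.0)      # assumed divisor of 2
        a.append((f[i + 2] + f[i + 3]) / 2.0)
    d = [0.0] * (len(f) // 2)                  # details zeroed, not computed
    return a, d
```

The point of the sketch is the operation count: `haar_step` performs additions and subtractions for both halves, while `mhwt_step` computes only the averaging half.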

DESIGN AND IMPLEMENTATION
The hardware model is developed for image fusion using the Modified Haar wavelet transform for FPGA implementation. The process uses hardware-software co-simulation [7][8][9][10]. The basic block diagram for MHWT based image fusion is shown in Figure 1. As the concept of 2D DWT using different wavelets has been covered in the background study and literature review, the explanation here is brief and only covers the part of the equations where the actual modification is made; the theory of the equations and the background study are drawn from [5][11][12].
The MHWT equations in the digital domain are represented as follows. The wavelet function is converted to the Haar operator as given in Equation (8).
For a 2D image, let x be a 2×2 matrix of image pixels; the transform y is obtained by multiplying the columns of x by the matrix T, and then the rows of the result by Tᵀ, using Equation (9).
The original values are recovered using Equation (10). For an image matrix with starting row pixels a, b, c, d, all the pixels of the image are grouped into 2×2 blocks and the transforms are obtained using Equation (11). After two successive transforms on the images, we obtain LL, LH, HL and HH for the first image, and LL1, LH1, HL1 and HH1 for the second image. The components obtained are fused based on min-max logic, as shown in Equations (12), (13) and (14).
The values of A, B, C, D are held in a buffer and reconstructed in matrix form to produce the fused image. In the implementation, the serialization, deserialization and the actual fusion rule are implemented in Simulink; the Xilinx System Generator is used to co-simulate the VHDL design with Simulink.

Hardware level design for MHWT
The logic/RTL design is shown in Figure 2. The MHWT algorithm is used in the design phase. The transform block works in coordination with the clock and the other functional blocks in the hardware design. The total implementation is a combination of Simulink and the MHWT FPGA design.
In the VHDL design for MHWT, the block has mainly one input for image data, "Image_in". The data is integer, with 11 bits of information per pixel; this margin ensures no information is lost if a higher value occurs. In the actual design, several down-sampling stages are added at later points to reduce the computational and clock complexity. The design uses a moving-window architecture formed by an array of D flip-flops: for a 256×256 image, image data is taken in at every clock cycle and held in the flip-flop array, which acts as a functional block alongside the transform block within the entity, and the data is tapped out from the flip-flops as required for computation, as shown in Figure 3. The logic is implemented according to the equations specified earlier in the digital domain. Based on this architecture, the source code is written in VHDL (VHSIC Hardware Description Language) and the input ports are defined accordingly; at a later stage the design is run at the initiation of Simulink, as explained below. This VHDL implementation works on one image at a time, and to test the design the image data needs to be in serial format: the image must first be converted into a text file that is readable in VHDL using the standard set of libraries.
The pixel values in that text file can then be accessed serially by the VHDL design. The Simulink co-simulation is therefore used for the complete fusion implementation, since two images need to be accessed at a given time. The Simulink design is divided into two models: a software model, which co-simulates the entire system on the computer itself, and a hardware model, in which only the MHWT part runs on the Spartan-6 series FPGA; the two can be referred to as software co-simulation and hardware co-simulation respectively.
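The image-to-text pre-processing step described above can be sketched in a few lines. The one-value-per-line file layout and the helper name `image_to_text` are assumptions for illustration; the 11-bit width matches the "Image_in" port described earlier. A file of this shape can be read serially in a VHDL testbench with the standard `std.textio` package.

```python
def image_to_text(pixels, path):
    """Write a 2-D list of grayscale pixel intensities to a text file,
    one value per line, row by row, so a VHDL testbench can read the
    stream serially. Values are masked to 11 bits to match the assumed
    'Image_in' port width (layout is an illustrative assumption)."""
    with open(path, "w") as fh:
        for row in pixels:
            for p in row:
                fh.write(f"{p & 0x7FF}\n")  # mask to 11 bits
```

On the VHDL side, each `readline`/`read` call then consumes exactly one pixel per clock cycle of the testbench.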
The subsystem of the DWT block for FPGA implementation in Simulink is shown in Figure 4. The blocks with "in" and "out" gateways determine the boundaries of the VHDL design; the block in between, called "subsystem", is entirely the VHDL design, so the in/out blocks must be placed before accessing it. The "subsystem" block has a subsystem of its own, shown in Figure 5, in which the core VHDL design of the MHWT is used to obtain the L band and H band; these blocks are then used consecutively to obtain the higher-level decomposition. As seen in Figure 4, the MHWT is applied twice, for the L band and the H band separately. All the bands obtained, i.e. LL, LH, HL and HH, are concatenated in matrix form to obtain the transformed image. The process is repeated for the second image with the DWT block for the full design. The fusion rule is then applied: the average of the LL bands from the two images is taken, while for the higher frequency bands the maximum of the two is chosen, since the LL bands carry the most information and the HH bands carry the peripheral information; hence averaging is applied only to the lower frequency components. The biorthogonal IDWT block from Simulink itself is used to produce the fused image, which can be viewed in Simulink.
The system is implemented on the Digilent Atlys Spartan-6 trainer board. It hosts a Spartan-6 XC6SLX45-3CSG324 FPGA with 128 Mbyte of DDR2 DRAM; Gigabit Ethernet, HDMI video and AC97 audio make the Atlys board an ideal host for complete digital systems built around the Xilinx MicroBlaze embedded processor. The board is powered by a 5 V adapter, and a micro USB cable provides the interface between the computer and the board.
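The band-level fusion rule described above (average the LL bands, keep the stronger coefficient in the detail bands) can be sketched as follows. The dictionary layout and the use of maximum absolute value for the detail bands are assumptions: detail coefficients can be negative, so comparing magnitudes is one reasonable reading of "maximum value among the two", but the paper's Equations (12)-(14) may use the plain maximum.

```python
import numpy as np

def fuse_bands(bands1, bands2):
    """Fuse two one-level decompositions given as dicts of arrays keyed
    'LL', 'LH', 'HL', 'HH'. LL bands are averaged; for each detail band
    the coefficient with the larger magnitude is kept (assumed rule)."""
    fused = {"LL": (bands1["LL"] + bands2["LL"]) / 2.0}
    for name in ("LH", "HL", "HH"):
        b1, b2 = bands1[name], bands2[name]
        fused[name] = np.where(np.abs(b1) >= np.abs(b2), b1, b2)
    return fused
```

The fused band set would then go through the inverse transform (the biorthogonal IDWT block in the Simulink model) to produce the fused image.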

RESULTS AND DISCUSSIONS
Compilation and synthesis are performed for the design, and the timing diagram is obtained from the test bench; the clock is run until all values are read in, and the total simulation time for one image is about 560 milliseconds, i.e. until the last set of transforms is done. The timing analysis obtained from the simulation is shown in Figure 7. The resultant images of the hardware co-simulation design are presented below: Figure 8 shows the intermediate decomposition, Figure 9 shows the different types of input images, and Figure 10 shows the resultant fusion; the image dataset is borrowed from Yu Liu's blog site [12]. In Figure 8 it can be seen that the image is decomposed into LL, LH, HL and HH bands. Different kinds of images, such as multifocal, multi-temporal and multimodal images, are used to test how well the design fuses two images, and the results are analysed visually for quality. Medical images with higher resolution and low noise are shown in Figure 9. The two source images used are multifocal, i.e. opposite parts of the two images are out of focus; the resultant fusion combines the two clear parts of both source images. The edges smoothly vanish into the image, which can be addressed by applying various noise cancellation methods. As the MHWT FPGA implementation is recent, much comparative literature is unavailable.

CONCLUSION
MHWT based image fusion is realized with a design implemented through software-hardware co-simulation. The MHWT algorithm is designed in VHDL for a Spartan-6 FPGA. Performing further pre-processing on the FPGA itself may allow an increase in speed and quality. For now, the system behaves well with images with low noise content, especially medical images. As seen in the results section, for results 1 and 2, i.e. the multi-temporal medical images, the fused image is much better compared to the multimodal and multifocal images; quality is analysed based on how much detail and clarity is present in the image through visual analysis. The system also gives good quality fusion for over- or under-exposed images. Overall the design is compact; however, there is always scope for further compression and for improving the quality of fusion. The computations can be reduced further, and more efficiency can be achieved with more research on the algorithm. Based on this work, high-level complex algorithms can be designed for FPGA implementation.
The FPGA design for MHWT based image fusion can further be packaged as an IP, and can later find use in SoCs, ASICs or large integrated systems where image fusion is necessary.