A Modified Diagonal Mesh Shuffle Exchange Interconnection Network

Received Dec 20, 2016; Revised Feb 24, 2017; Accepted Mar 16, 2017

The interconnection network is an important part of a digital system. It is described by its topology together with the routing algorithm and the flow control mechanism. The topology plays an important role in the performance of the system. The mesh interconnection network is the simplest topology but has limited bisection bandwidth, whereas the torus and the diagonal mesh have long links. The modified diagonal mesh network replaced the toroidal links but had a higher average path length, so the proposed topology improves the average distance by using a shuffle exchange network over the boundary nodes. In this paper, we propose the architecture of the Modified Diagonal Mesh Shuffle Exchange Interconnection Network. It is compared with four popular topologies, namely the simple 2D Mesh, 2D Torus, Diagonal Mesh and Modified Diagonal Mesh Interconnection Network, on four traffic patterns: Bit Complement, Neighbor, Tornado and Uniform traffic. We have also performed the analysis with 5% and 10% hotspots on the Uniform traffic. The simulation results show that the proposed topology performs better on bit complement traffic and can also handle the other traffic patterns up to a certain level.


INTRODUCTION
A digital system primarily consists of three parts: processing units, memory units and the interconnection network that lies between them [1]. Initially the interconnection network was a single bus shared by all the processing units and memory units, generally referred to as resources. A single bus could not satisfy the demands of all the resources, so it was replaced by multiple buses. This leads to the mathematical definition of the interconnection: an interconnection network can be viewed as a graph G(V, E) with a set of vertices "V" and a set of edges "E". The vertices represent the processing and memory nodes and the edges represent the links. The cost of providing multiple buses to each and every node proves to be high. This led to solutions based on the tile architecture proposed in [2]. It has been highlighted in [3] that a tile architecture could have thousands of cores. The architecture of an SoC depends greatly on the application, so there is a need to explore new topologies [4]. The first basic tile architecture was the simple mesh topology. The most important drawback of this topology is its high diameter, which was reduced in the modified version named the torus topology. The main issue with the torus topology is the long links connecting nodes that are far apart. When long toroidal links are used, as in the torus and the diagonal mesh, the latency between the various nodes is not uniform. Variants like the diagonal mesh have an even smaller diameter, but they increase the hardware cost and do not work for all values of "n". In [5], the modified diagonal mesh interconnection network (MDMIN) was proposed, which removed the toroidal links. A detailed analysis of the traffic suitable for this type of topology was not given in [5].
In this paper, we identify the ideal traffic for the MDMIN and propose a new variant based on it, the modified diagonal mesh with shuffle exchange interconnection network (MDMSEIN). We compare its performance with the existing mesh interconnection networks, namely the simple 2D mesh, 2D torus, diagonal mesh (DMESH) and modified diagonal mesh interconnection network (MDMIN), on various traffic patterns: bit complement, neighbor, tornado and uniform traffic. We have also analyzed the performance of the proposed topology with 5% and 10% hotspots on the uniform traffic. The paper is divided into five sections. Section 2 introduces the various mesh interconnection networks and the different types of traffic. The proposed MDMSEIN is presented in Section 3. In Section 4, we discuss the simulation environment and the results on the various traffic patterns using OMNeT++, followed by the conclusion and references.

PRELIMINARIES AND BACKGROUND
The mesh is a direct topology, in which each switch is connected to a core or processing element. The mesh interconnection network has been widely used to connect processors. It is popular due to its simplicity, which makes it easy to implement. The mesh topology or its variants have been used in supercomputers such as the AP3000, Ametek 2010 [6][7][8], Cray T3D, Cray T3E [9], Fujitsu and Intel Touchstone [10]. The simplicity of the mesh has led to the exploration of different variants such as the torus, the diagonal torus and many more [11]. Today's high-speed processing requirements involve communication between multiple processors, which forces the search for a high-speed interconnection. The choice of topology helps to obtain higher throughput at optimal cost. It also affects processor utilization and processing power. In particular, we focus on mesh interconnections, especially the diagonal toroidal mesh, and suggest a variant of it. Simple mesh and torus interconnection networks have been widely used in commercial high-performance computing devices [12][13][14][15]. The mesh and torus interconnection networks are shown in Figures 1 and 2 respectively. They can be described as a graph "G" having "n" vertices and "E" edges. In a mesh interconnection, the switches are the nodes themselves, i.e. every node represents both a processing element and a switching element. The processing element is used for computation. The switching element is part of the communication and is responsible for routing the packets. The torus is generated from the simple mesh by adding a few extra links or edges that connect the two extremes both horizontally and vertically. These extra links reduce the diameter and increase the bandwidth of the interconnection.

Diagonal Mesh Interconnection Network (DMESH)
DMESH is a variant of the mesh interconnection network. The difference between DMESH and the 2D torus is that in DMESH the nodes are connected diagonally instead of horizontally and vertically. DMESH is shown in Figure 3. Many variants of DMESH have been proposed by various researchers [16][17][18][19]. Here, we suggest a modification of the MDMIN [5], a variant of the diagonal mesh shown in Figure 4. The MDMIN has horizontal and vertical links on the boundary nodes and diagonal links on the interior nodes.

Definition of Network Traffic Patterns
The traffic pattern is defined as the spatial distribution of messages over the nodes. It is represented by the traffic matrix $\Lambda$, whose element $\lambda_{s,d}$ describes the number of packets sent from source "s" to destination "d".
Definition 2.2.1 Bit Complement Traffic: The address of the destination node is the bitwise complement of the address of the source node. It is given by equation (1) [1], [20], [21].
Definition 2.2.2 Uniform Traffic: Every source sends to each destination with equal probability, as given by equation (2), where "N" is the number of nodes. This makes the probability of each node uniform [1], [20], [21].
Definition 2.2.3 Neighborhood Traffic: Here, each node sends data to the corresponding diagonal neighbor. It is given by equations (3) and (4), where "k" is the number of nodes in a particular dimension [1], [20].

Definition 2.2.4 Tornado Traffic:
This traffic is similar to the neighbor traffic, but packets are sent to the nodes at half of the distance across the network. It is given by two expressions, one for the x-component and one for the y-component, described in equations (5) and (6), where "k" is the number of nodes in a particular dimension [1], [20].
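The three deterministic patterns above can be sketched in a few lines of Python. Equations (1) and (3) to (6) are not reproduced in this text, so the sketch assumes common textbook forms: the neighbor pattern adds 1 to each coordinate modulo k, and the tornado pattern adds ceil(k/2) - 1 to each coordinate modulo k. The function names are illustrative only and the exact offsets may differ from the original equations.

```python
from math import ceil


def bit_complement(src: int, n_nodes: int) -> int:
    """Destination whose address is the bitwise complement of src.

    Assumes n_nodes is a power of two so every address has a fixed bit width.
    """
    if n_nodes <= 0 or n_nodes & (n_nodes - 1):
        raise ValueError("n_nodes must be a power of two")
    return ~src & (n_nodes - 1)


def neighbor(x: int, y: int, k: int) -> tuple:
    """Diagonal neighbor of (x, y) on a k x k network (assumed +1 offset)."""
    return ((x + 1) % k, (y + 1) % k)


def tornado(x: int, y: int, k: int) -> tuple:
    """Node roughly half way across each dimension (assumed textbook offset)."""
    shift = ceil(k / 2) - 1
    return ((x + shift) % k, (y + shift) % k)
```

On a 64-node (8 x 8) network, node 5 (binary 000101) sends to node 58 (binary 111010) under bit complement traffic, and under these assumptions the tornado offset is 3 in each dimension.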
Definition 2.2.5 Hotspot: When a node receives more traffic than the other nodes, we say that there is a hotspot in the network. The hotspot traffic is represented mathematically by equation (7),
where "c" is an arbitrary node to which the traffic is directed [1].
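A destination generator for uniform traffic with a hotspot might look like the following sketch. Equation (7) is not reproduced in this text, so the model here is an assumption: with probability `hotspot_fraction` a packet goes to the hotspot node, and otherwise the destination is drawn uniformly from all nodes other than the source. The function and parameter names are hypothetical.

```python
import random


def pick_destination(src, n_nodes, hotspot=None, hotspot_fraction=0.0, rng=random):
    """Return a destination different from src.

    Assumed model: a fraction of packets is redirected to the hotspot node,
    the rest follows uniform traffic over the remaining n_nodes - 1 nodes.
    """
    if hotspot is not None and hotspot != src and rng.random() < hotspot_fraction:
        return hotspot
    # Draw from n_nodes - 1 values and skip over src to keep the draw uniform.
    dest = rng.randrange(n_nodes - 1)
    return dest if dest < src else dest + 1
```

Setting `hotspot_fraction=0.05` or `0.10` would correspond to the 5% and 10% hotspot scenarios under this assumed model.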

MODIFIED DIAGONAL MESH SHUFFLE EXCHANGE INTERCONNECTION NETWORK (MDMSEIN)
The modified diagonal mesh with shuffle exchange interconnection network is an enhancement of the MDMIN. The MDMIN has a higher average path length than the torus and the diagonal mesh, so to improve its performance further we introduce a shuffle exchange network on the horizontal and vertical links. The shuffle exchange network is shown in Figure 5. The main reason for selecting the shuffle exchange network is that it reduces the diameter along the horizontal and vertical boundary nodes of the network. The diameter of a shuffle exchange network of "N" nodes is equal to $2\log_2 N - 1$ [22]. The modified IN of size 8×8 is shown in Figure 6. The edges in the topology are described by equation (8). The assumptions made while writing the equations are as follows: the nodes lie in the first quadrant of the coordinate system, and the coordinates of any node are (x, y). Under these assumptions, the edges are given by equations (8), (9), (10) and (11) for the x and y coordinates, where $X_i$ and $Y_i$ are sets of coordinates. Equation (10) defines $X_i$, and a similar equation in terms of $Y_i$ represents the y-coordinate points.
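To illustrate why a shuffle exchange network keeps the diameter small, the sketch below builds a standalone shuffle exchange graph on $2^{bits}$ nodes and measures its diameter by breadth-first search. It models only the generic shuffle exchange graph (exchange flips the lowest address bit, shuffle rotates the address bits), not the way the links are attached to the MDMSEIN boundary, and the function names are illustrative.

```python
from collections import deque


def shuffle_exchange_neighbors(node: int, bits: int) -> set:
    """Neighbors of a node in an undirected shuffle exchange graph."""
    mask = (1 << bits) - 1
    exchange = node ^ 1                                      # flip the lowest bit
    shuffle = ((node << 1) | (node >> (bits - 1))) & mask    # rotate left
    unshuffle = (node >> 1) | ((node & 1) << (bits - 1))     # rotate right
    return {exchange, shuffle, unshuffle} - {node}


def diameter(bits: int) -> int:
    """Longest shortest path over all node pairs, found by BFS from each node."""
    worst = 0
    for start in range(1 << bits):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in shuffle_exchange_neighbors(u, bits):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst
```

For 8 nodes (3 address bits) the search returns 5, the distance between nodes 000 and 111, which grows only logarithmically with the number of nodes.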
To understand these equations, we study two cases. Case 1: The node is an internal node of the MDMSEIN. Let the node be (1, 5). We use equation (10) to find $X_i$; since x satisfies the first condition, the set is $X_i = \{0, 2\}$. Similarly, $Y_i = \{4, 6\}$. According to equation (9), f(x, y) is the Cartesian product of the two sets: f(x, y) = {(0, 4), (0, 6), (2, 4), (2, 6)}. After computing f(x, y), we compute g(x, y) from equation (11). From Figure 6, we verify that the node (0, 7) is connected to these three nodes.
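The internal-node case can be reproduced with a short sketch. It assumes, following the worked example for node (1, 5), that $X_i$ and $Y_i$ contain the coordinates one step on either side of x and y. The boundary rule g(x, y) of equation (11) is not reproduced in this text and is therefore not modeled; the function name and the `size` parameter are illustrative.

```python
from itertools import product


def internal_diagonal_neighbors(x: int, y: int, size: int = 8) -> set:
    """Diagonal neighbors f(x, y) of an internal node on a size x size network."""
    if not (0 < x < size - 1 and 0 < y < size - 1):
        raise ValueError("only internal nodes are handled in this sketch")
    xs = {x - 1, x + 1}           # the set X_i
    ys = {y - 1, y + 1}           # the set Y_i
    return set(product(xs, ys))   # Cartesian product X_i x Y_i
```

Calling `internal_diagonal_neighbors(1, 5)` yields the four nodes (0, 4), (0, 6), (2, 4) and (2, 6) listed in Case 1.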

EVALUATION OF MDMSEIN AND OTHER MESHES ON TRAFFIC PATTERNS USING OMNET++

4.1. Experimental Setup and Testbed
In this section, we present the simulation results. We used a Lenovo system with an Intel Core 2 CPU T5200 @ 1.60 GHz and 2 GB of RAM, running Windows 7 and the OMNeT++ simulator version 4.4.1.
Definition 4.1.1 OMNeT++: OMNeT++ is a framework based on the Eclipse IDE that provides a graphical environment for simulations. OMNeT++ uses a modular concept and has a component-based library for the simulation of both wired and wireless networks [23].
Figure 7 shows that each node has three components: app, routing and queues. The queues are connected to other nodes by channels according to the topological design of the IN used. The app module is responsible for generating packets and for receiving the packets destined for its node, so it simply behaves as the source and sink. The traffic pattern is defined in the app module itself. The second module is routing, which provides the routing algorithm for the topology. In our case, we have used the distance vector routing that is implemented by default in OMNeT++. It uses a routing table to find the optimal path, which is decided based on the neighbors of each node. The queues store the packets arriving at the node on a particular channel and the packets to be sent on a particular channel. In the simulation there is no restriction on the size of the queues, that is, any number of packets can be stored in the buffer, so no packet is dropped due to the queue size. The channels used are bidirectional, but in some topologies the links are unidirectional. In that case the channels are connected using separate in-ports and out-ports, which makes the channel unidirectional.
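The role of the routing table can be illustrated with a small sketch that computes, for one destination, the next hop and hop count from every node. This is not the OMNeT++ implementation: it is a plain breadth-first search over an assumed k x k 2D mesh with bidirectional links, and all names are illustrative.

```python
from collections import deque


def mesh_links(k: int) -> dict:
    """Adjacency lists of a k x k 2D mesh with bidirectional links."""
    adj = {(x, y): [] for x in range(k) for y in range(k)}
    for (x, y) in adj:
        for dx, dy in ((1, 0), (0, 1)):
            nx, ny = x + dx, y + dy
            if nx < k and ny < k:
                adj[(x, y)].append((nx, ny))
                adj[(nx, ny)].append((x, y))
    return adj


def next_hop_table(adj: dict, dest) -> dict:
    """Map each node to (next hop toward dest, hop count) using BFS from dest."""
    table = {dest: (dest, 0)}
    queue = deque([dest])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in table:
                # v was reached from u, so u is v's next hop toward dest.
                table[v] = (u, table[u][1] + 1)
                queue.append(v)
    return table
```

Swapping `mesh_links` for a function that returns the links of another topology would give the corresponding shortest-path tables, which is one way to compare hop counts across topologies before running a full simulation.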
Definition 4.1.2 Network Throughput: Throughput is defined as the number of messages successfully received per unit of time. For a system we calculate the aggregate throughput, which is the sum of the throughput of each and every node [19], [20].
Definition 4.1.3 Network Latency: Network latency is defined as the time taken by a message to travel from the source to the destination. The latency of the network increases when the network begins to saturate, as packets have to wait in the network for a large fraction of the time [19], [20].
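Both metrics can be computed from simple per-packet records, as in the following sketch. The record layout (a dictionary of received-packet counts per node, and a list of send and receive timestamps) is an assumption made for illustration and does not describe the output format of the simulator.

```python
def aggregate_throughput(received_per_node: dict, duration: float) -> float:
    """Sum of per-node throughput: total packets received per unit of time."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return sum(received_per_node.values()) / duration


def mean_latency(packets: list) -> float:
    """Average source-to-destination time over (send_time, receive_time) pairs."""
    if not packets:
        raise ValueError("no packets recorded")
    return sum(recv - sent for sent, recv in packets) / len(packets)
```

For instance, two nodes that receive 10 and 30 packets over 2 time units give an aggregate throughput of 20 packets per time unit.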

Load factor:
The load on channel "c" is the amount of traffic that must cross the channel if each input injects one unit of traffic according to the given traffic pattern. The load factor is better understood through the example in Figure 8. If a channel is busy for the whole observation period, the load on the channel is 100%. If the channel is idle for half of the observation period, the load on the channel is 50% [21], [24]. Therefore, we can describe the load on the various topologies using the inter-packet delay.
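The busy-fraction view of load, and its link to the inter-packet delay, can be written down as a short sketch. The second function rests on an assumption not stated in the text: a source that sends one packet with a fixed transmission time every inter-packet delay keeps its channel busy for the ratio of the two. The names are hypothetical.

```python
def channel_load(busy_time: float, observation_time: float) -> float:
    """Fraction of the observation period during which the channel is busy."""
    if observation_time <= 0:
        raise ValueError("observation_time must be positive")
    return busy_time / observation_time


def offered_load(packet_time: float, inter_packet_delay: float) -> float:
    """Assumed offered load of a periodic source, capped at a fully busy channel."""
    if inter_packet_delay <= 0:
        raise ValueError("inter_packet_delay must be positive")
    return min(1.0, packet_time / inter_packet_delay)
```

Under this assumption a shorter inter-packet delay means a higher load, which is why the results can be plotted against the inter-packet delay.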

Comparative Analysis and Discussion
The parameters of the experimental setup used for Scenario 1 and Scenario 2 are described in Table 1.

Uniform Traffic
The graph in Figure 9 shows that the MDMIN and the MDMSEIN are comparable to each other in performance, while the mesh saturates much earlier, at a lower load. The torus network shows an increase in latency at a considerably lower load than the MDMIN and the MDMSEIN. At a packet injection interval of 10.24 µs, DMESH gives the best results. This is because DMESH has a higher bisection width and a higher node degree, which also increase the cost of the IN. In the MDMIN and the MDMSEIN we have removed the long toroidal links, because long toroidal links result in varying latency and bring non-uniformity into the network.

Bit Complement Traffic
In the case of bit complement traffic, the graph in Figure 10 shows that the MDMSEIN gives good performance. The MDMIN also gives low latency at the initial loads, but as the load increases its latency also increases and becomes almost equal to that of the torus network. At higher loads the latency of the torus exceeds that of the MDMIN, whereas the proposed MDMSEIN gives better performance at all the different loads, based on the packet injection rate. The results of the MDMSEIN are also comparable to those of DMESH up to an inter-packet arrival delay of 13.65 µs.

Tornado Traffic
From Figure 11 we observe that the performance of the proposed MDMSEIN and of the MDMIN is comparable to that of the torus and better than that of the simple mesh, even though the torus has longer wires and more toroidal links. The diagonal mesh shows high performance because of its double bisection width and its complex diagonal interconnection.

Neighbor Traffic
The performance of the MDMSEIN and the MDMIN is again comparable to that of the torus and better than that of the simple mesh, as shown in Figure 12. This performance is obtained without using the long toroidal links. The neighbor traffic is considered important for any IN because applications communicate more with neighboring nodes than with distant nodes. The positive point here is that we have achieved better performance than DMESH, which has double the bisection bandwidth of the MDMSEIN and the MDMIN.

Scenario 2
In Scenario 2, we consider hotspot traffic of 5% and 10% along with the uniform traffic. The results are discussed as follows.

Uniform Traffic with hotspot 5%
The performance with a 5% hotspot is shown in Figure 13. The latency of the 2D mesh topology increases significantly compared to the uniform traffic with no hotspots. The other topologies, namely the 2D torus, MDMIN, MDMSEIN and DMESH, are not significantly affected by the hotspots. The results of the 2D torus, MDMIN and MDMSEIN are almost identical.

Uniform Traffic with hotspot 10%
The performance under hotspot traffic is shown in Figures 13 and 14. We found that at the lower hotspot level the performance of the MDMIN and the MDMSEIN is affected, but when the hotspot is increased from 5% to 10%, the latency performance of the torus also degrades significantly in comparison to that of the MDMIN.

CONCLUSION AND FUTURE WORK
We have proposed a new topology based on an existing topology, the MDMIN. From Figure 9 we found that the MDMIN and the MDMSEIN are better than the torus network on uniform traffic. Figure 10 shows that the MDMSEIN improves on its predecessor, the MDMIN. In the case of tornado traffic, both are close to the torus topology. Neighbor traffic also shows an improvement, although the MDMSEIN is slightly slower than the MDMIN. From Figures 9 to 12 we can conclude that the MDMSEIN can be used for applications based on bit complement traffic and neighbor traffic. Figures 13 and 14 show that the interconnection can also manage hotspots in the network. With a 5% hotspot the results are comparable to those of the torus, as the selected hotspots are favorable to the torus topology. The latency of the torus appears to be good, but when the hotspot is raised to 10% the torus begins to saturate. The topology alone is not responsible for the performance, which is also affected by the routing algorithm and the flow control mechanism used. Therefore, in future we will investigate deterministic and adaptive routing algorithms suitable for the proposed topology. The routing used here can be seen as a shortest-path algorithm, which may not be efficient in the case of congestion, so an adaptive algorithm has to be found for the proposed topology.