Experimental Evaluation of Self-organized Backpressure Routing in a Wireless Mesh Backhaul of Small Cells

Small cells (SCs) are low-power base stations designed to cope with the anticipated huge traffic growth of mobile communications. These increasing capacity requirements call for a corresponding backhaul capacity to transport traffic from/to the core network. Since it is unlikely that fiber reaches every SC, a wireless mesh backhaul amongst SCs is expected to become popular. These low-cost deployments require balancing resource consumption amongst SCs; however, current routing protocols were not designed to fulfill this requirement. To tackle this challenge, we presented and developed with ns-3 a self-organized backpressure routing protocol (BP), designed to make the most out of the backhaul resources. This paper provides the evaluation of BP, both exploiting built-in ns-3 emulation features to allow rapid prototyping under real-world conditions and through controlled ns-3 simulations. Through a novel evaluation methodology based on ns-3 emulation, we evaluate BP in a 12-SC indoor wireless mesh backhaul testbed under different wireless link rates and topologies, showing Packet Delivery Ratio (PDR) gains of up to 50% with respect to shortest path (SP). Through simulations, we show the scalability properties of BP with both the size of the backhaul and the number of backhaul radios per SC. Results in single- and multi-radio deployments show TCP traffic gains with BP of up to 79% and 95% in terms of throughput and latency, respectively.


Introduction
The ever-increasing demand for wireless data services has given a starring role to dense deployments of low-power base stations referred to as small cells (SCs), as increasing frequency re-use by reducing cell size has historically been the most effective and simple way to increase capacity. Such densification entails challenges at the Transport Network Layer (TNL), since hard-wired backhaul deployments of SCs prove to be cost-prohibitive and inflexible. The main challenge is to provide cost-effective and dynamic TNL solutions for dense and semi-planned SC deployments. An approach to decrease costs and augment the dynamicity at the TNL is the creation of a wireless mesh network [1] amongst SCs to carry control and data plane traffic to/from the core network [2].
A wireless mesh backhaul requires practical routing schemes that achieve an even resource consumption to ease the mentioned capacity crunch. An identified requirement when steering traffic is to dynamically grow or shrink the pool of SC resources according to network conditions, thus exploiting the capacity offered by the wireless mesh backhaul. However, the IEEE 802.11s [3] mesh standard specifies the Hybrid Wireless Mesh Protocol (HWMP) (see section 2), a tree-based protocol oriented to provide network connectivity rather than the exploitation of resources. Aiming for capacity, the original centralized backpressure algorithm [4] proved to be throughput optimal in theory. However, its implementation under real-world conditions showed scalability problems with the number of flows, and it forces the wireless network to operate on a Time Division Multiple Access (TDMA) MAC [5]. To counteract these issues, in [6,7] we presented and evaluated through ns-3 simulations [8] a self-organized backpressure routing protocol (BP) for the TNL that dynamically grows or shrinks SC resources according to network conditions (see section 2). Specifically, [6] focuses on multi-gateway SC deployments, whereas in [7] the focus is on sparse wireless mesh backhaul deployments. Additionally, in [9] we presented the integration details and preliminary experimental results of our scheme using ns-3 emulation features (see section 2). However, the evaluation in [6] was based merely on simulations of single-radio single-channel backhaul deployments, and the experimental evaluation in [9] was limited.
The contribution of this paper is two-fold. First, we detail the configuration of the experimental platform (see section 3) using ns-3 emulation, composed of 12 SCs endowed with 3G for the Radio Access Network (RAN) and an additional WiFi card to form a WiFi-based mesh backhaul amongst them. Second, we test BP in a wide variety of realistic wireless mesh backhaul conditions (see section 5). In particular, using ns-3 emulation we demonstrate the operation of BP under different wireless link configurations (wireless link rate, ambient noise reduction techniques) and show how, by switching SCs in the testbed on and off, BP adapts to dynamic wireless mesh backhaul deployments. We observe gains of around 50% in terms of PDR with respect to a shortest path (SP) routing policy. Additionally, we demonstrate through ns-3 simulations the scalability of BP with both the number of SCs and the number of backhaul radios per SC forming a multi-radio multi-channel wireless mesh network. We provide first simulation results of BP with TCP traffic, showing promising advantages. In particular, BP shows significant gains over SP routing of up to 79% and 95% in terms of throughput and latency, respectively. We conclude the paper and outline future possibilities in section 6.

Emulation
Fall [10] classifies emulation into two types. In environment emulation, an implementation environment is built so that real protocol implementations can be executed in a simulator, whilst in network emulation simulated components interact with real-world implementations. As for the former, Network Simulation Cradle (NSC) [11] pioneered the introduction of real stacks into network simulators. Recently, in [12] a framework for executing Linux kernel code in the ns-3 simulator was presented. As for the latter, which is the focus of this paper, the ns-3 EmuNetDevice running on top of the real physical device uses MAC spoofing to avoid conflicts between the guest ns-3 IP stack running in user space and the native IP stack. Therefore, the EmuNetDevice can send over the physical device packets whose assigned MAC addresses differ from those identifying the real device. In this way, packets received by the EmuNetDevice are sent to the ns-3 IP stack whenever the MAC destination address corresponds to the spoofed ns-3 MAC address.
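The receive-side rule described above can be illustrated with a minimal Python sketch. This is not the ns-3 EmuNetDevice code: the function names are hypothetical, and the acceptance of broadcast frames is an assumption added here because BP relies on HELLO broadcast messages.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def normalize_mac(mac: str) -> str:
    """Return a MAC address in lowercase, colon-separated form."""
    return mac.strip().lower().replace("-", ":")


def deliver_to_emulated_stack(dst_mac: str, spoofed_mac: str,
                              accept_broadcast: bool = True) -> bool:
    """Decide whether a received frame goes to the guest (emulated) IP stack.

    A frame is handed to the guest stack when its destination matches the
    spoofed MAC address. Accepting broadcast frames is an assumption of
    this sketch, not a documented behavior.
    """
    dst = normalize_mac(dst_mac)
    if dst == normalize_mac(spoofed_mac):
        return True
    return accept_broadcast and dst == BROADCAST_MAC
```

Frames addressed to the native MAC address of the card are rejected by this rule, so they remain with the native IP stack and the two stacks do not conflict.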

Routing for WMNs
Transport schemes for the mobile backhaul, such as [14], assume that they will be rolled out with a highly reliable infrastructure (e.g., fiber) underneath. This is not the case for low-cost wireless mesh networks (WMNs) [1], which are unstable, unreliable, and have scarce resources. With such constraints, finding routing paths is a fundamental problem for a WMN. The IEEE 802.11s [3] mesh standard specifies the Hybrid Wireless Mesh Protocol (HWMP), a tree-based protocol oriented to provide network connectivity rather than the required capacity.

The comprehensive survey in [15] classifies routing protocols for WMNs into two main groups: multi-radio and opportunistic routing protocols. Although multi-radio routing protocols aim to exploit the added capacity brought by multi-radio multi-channel WMNs, they rely on the establishment of fixed single/multiple end-to-end paths, assuming high network stability regarding path qualities. Such an assumption may be inappropriate in unreliable and low-cost WMN deployments. On the other hand, opportunistic routing protocols take routing decisions on a hop-by-hop basis. Their design is characterized by deferring the next-hop decision until after the packet has been transmitted. Although these protocols aim to address the inherent unreliability of WMNs, they incur extra coordination overhead amongst the possible set of forwarders and the potential generation of packet duplicates.
In terms of methodologies, the evaluation of routing schemes under real-world conditions is of prime importance. Researchers have recently built a number of WMN testbeds to evaluate the sources of performance degradation of routing protocols in real environments. Previous work [16] characterized the sources of noise in an indoor deployment using 802.11b/g, evaluating different routing protocols. In outdoor deployments [17], sources of noise are more frequent due to interference from external non-WiFi devices. Our work in terms of testbed deployment in this paper is similar to that in [18] in the sense that we also deploy an indoor WMN using 802.11a technology as backhaul.

Self-organized BP for the Wireless Mesh Backhaul
Unlike the routing families previously described, our scheme [6], referred to as self-organized Backpressure Routing (BP), is based on a decentralized flavor of the original centralized backpressure algorithm [4], which promises throughput optimality, assisted by scalable geographic routing [19]. As observed in [5], the centralized backpressure algorithm is limited in practice to small centralized wireless networks. First, it maintains centralized routing tables and a queue per flow in every node. In addition, it does not provide any guarantees in terms of latency, and it forces the wireless network to operate on a Time Division Multiple Access (TDMA) MAC, as originally proposed in theory. In contrast, as detailed in [6], BP relies on a single queue backlog per node and geolocation information to take per-packet routing decisions that dynamically grow or shrink path utilization according to traffic conditions. For each packet being routed in SC $i$, BP computes wireless link weights with all potential SC neighbors $j$ according to $w_{ij} = \Delta Q_{ij} - V \times \mathrm{cost}$, where $\Delta Q_{ij}$ is the queue backlog differential between SC $i$ and SC $j$. The wireless link selected for transmission is the one that maximizes $w_{ij}$. Note also that the protocol is agnostic to the wireless technology underneath. All the necessary information to compute weights (queue backlog and geolocation) is exchanged amongst SCs by means of periodic HELLO broadcast messages. In essence, the protocol aims to transmit packets to the least loaded SC; however, such a decision may incur a cost based on the geographic progress towards the intended destination, weighted by the value assigned to the V parameter.
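The per-packet weight computation can be sketched in Python as follows. This is an illustrative sketch rather than the ns-3 implementation evaluated in the paper. Two points are assumptions made here: the queue backlog differential is taken as the backlog of SC i minus that of neighbor j, and the geographic cost is a hypothetical placeholder that returns -1 when the neighbor is closer to the destination than the current SC and +1 otherwise. All names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Neighbor:
    name: str
    position: Point      # learned from HELLO messages
    queue_backlog: int   # learned from HELLO messages


def distance(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def geographic_cost(own: Point, neighbor: Point, dest: Point) -> float:
    """Hypothetical cost: -1 if the neighbor makes geographic progress, else +1."""
    return -1.0 if distance(neighbor, dest) < distance(own, dest) else 1.0


def select_next_hop(own_pos: Point, own_backlog: int,
                    neighbors: List[Neighbor], dest: Point,
                    v: float) -> Tuple[Optional[Neighbor], float]:
    """Return the neighbor maximizing w_ij = dQ_ij - V * cost, and its weight."""
    best, best_weight = None, -math.inf
    for n in neighbors:
        delta_q = own_backlog - n.queue_backlog  # assumed sign convention
        weight = delta_q - v * geographic_cost(own_pos, n.position, dest)
        if weight > best_weight:
            best, best_weight = n, weight
    return best, best_weight
```

With a large V the neighbor offering geographic progress wins even if it is more loaded, whereas with V close to zero the least loaded neighbor is selected regardless of its position.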
The importance of this cost function with respect to the minimization of queue backlog differentials is determined by the value of the V parameter. BP includes a self-organized algorithm to calculate the V value on a per-packet basis. This algorithm aims to find the proper trade-off between balancing resource consumption (based on the minimization of queue backlog differentials) and the cost function (based on geographic progress to the destination). As detailed in [6], for each packet being routed in SC $i$, the value $V_i$ is computed from the queue backlogs $Q_j$, where $j$ includes the SCs at a 1-hop distance plus SC $i$ itself. In this way, to mitigate congestion SC $i$ decreases $V_i$, hence decreasing the importance associated with the cost function, whereas under no congestion SC $i$ increases $V_i$, hence increasing the importance of the geographic cost function.
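The qualitative behavior of the variable-V algorithm can be illustrated with a short Python sketch. The exact expression is given in [6] and is not reproduced here, so the rule below is only an assumed stand-in: V is taken as the room left in the most loaded queue within the 1-hop neighborhood, clamped at zero. The function name and the `queue_limit` parameter are hypothetical.

```python
from typing import Iterable


def compute_v(own_backlog: int, neighbor_backlogs: Iterable[int],
              queue_limit: int) -> int:
    """Assumed variable-V rule (not the expression from [6]).

    V shrinks as the most loaded queue in the 1-hop neighborhood
    (including the local SC) approaches the queue limit, and grows
    back when the neighborhood is lightly loaded.
    """
    worst_backlog = max([own_backlog, *neighbor_backlogs])
    return max(0, queue_limit - worst_backlog)
```

Under this assumed rule, a congested neighborhood drives V towards zero, so queue backlog differentials dominate the routing decision, while an idle neighborhood yields a large V, so geographic progress dominates.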
We use geographic greedy routing as the cost function to enable the scalability BP needs for its practical implementation. The resulting scalable BP scheme showed gains in terms of adaptability to wireless mesh backhaul dynamics in the preliminary work in [7] and in uplink traffic communications in [6], showing promising gains not only in throughput but also in latency. The results supporting these features have, hitherto, been obtained through ns-3 [8] simulations, which are insufficient by themselves.

Testbed Configuration
This section first describes the indoor all-wireless Network of SCs (NoS) [2] deployed on the first floor of the CTTC building over an approximate indoor area of 1200 square meters (see Figure 3). In the second subsection, we describe how we used ns-3 emulation to run BP in the testbed. The third subsection provides configuration details of the WiFi mesh backhaul.

Testbed Description
• Core Network: The Core Network Emulator in Figure 1 implements, besides other MNL functionalities, the Iuh interface to which the 3G RAN part of the SC can connect through a WiFi mesh backhaul.
• TNL GW: One SC also acts as a TNL GW [6]. This component is in charge of pulling/injecting packets from/to the wireless mesh backhaul.
• User Equipment: The User Equipment (UE) devices are laptops equipped with a UMTS PCMCIA card attached to the 3G SCs.

Configuration of BP using ns-3 emulation
As previously noted, each SC is directly connected to a WiFi router through Ethernet, hence forming a 3G SC with WiFi as backhaul interface. On top of the WiFi backhaul interface, we associate a single ns-3 IP stack including the implementation of BP at the ns-3 routing layer. Through the ns-3 EmuNetDevice (see Figure 2(a)), we associate with every ns-3 IP stack a MAC address different from that identifying the WiFi card underneath. In this way, ns-3 packets can be sent to the real testbed, and real packets can also be intercepted by the ns-3 stack. With this particular method, any network protocol can be implemented in ns-3 without modifications, as long as it works above layer 2. This method allowed us to evaluate BP in a real testbed, re-using in a real SC the ns-3 code developed in our previous work [6,7]. Note also that BP needs geolocation information in order to compute the weights of every link. Geographic coordinates are statically assigned to each ns-3 IP stack so as to form a 4x3 grid (see Figure 3) with a step of size 2.
One lesson learned from using the ns-3 emulation framework on top of a WiFi card is the importance of the MAC address associated with the ns-3 EmuNetDevice class. In the MadWiFi driver, which is used in our platform to manage WiFi cards, the Basic Service Set Identifier (BSSID) mask specifies the common bits that a MAC address must match in order to process a received packet. Therefore, the spoofed MAC address associated with the EmuNetDevice object in the ns-3 emulator must comply with the restrictions imposed by the BSSID mask.

[...] the broadcast rate of the probing packets and the measurement method (active or passive) are some of the key parameters to take into account [15]. In Figure 3, we show the resulting L3 connectivity patterns when sending HELLO broadcast messages at a physical rate of 54Mbps. We observed that every SC has at least two direct neighbors and no more than four.
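Returning to the BSSID mask restriction on spoofed MAC addresses, the check can be sketched in Python as below. The interpretation is an assumption based on the description above: every bit set in the mask must be identical in the spoofed address and in the hardware address of the card. The function names, the addresses and the mask used in the usage note are made-up values, not those of the testbed, and this is not MadWiFi code.

```python
def mac_to_int(mac: str) -> int:
    """Convert a MAC address such as '00:0b:6b:aa:bb:10' to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def complies_with_bssid_mask(spoofed_mac: str, hardware_mac: str,
                             bssid_mask: str) -> bool:
    """Check that the spoofed MAC matches the hardware MAC on all masked bits.

    Assumed semantics: bits set to 1 in the mask are the common bits
    that an address must share with the card's own address.
    """
    mask = mac_to_int(bssid_mask)
    return (mac_to_int(spoofed_mac) & mask) == (mac_to_int(hardware_mac) & mask)
```

For instance, with a hypothetical mask of ff:ff:ff:ff:ff:00, a spoofed address may differ from the hardware address only in its last byte.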

Testbed Characterization
Prior to evaluating BP in the testbed, we need to characterize the particularities of our prototype, such as the quality of the WiFi links and the performance provided by the ns-3 emulator.

We use HELLO messages generated by BP, which is rolled out in each SC.

Emulation Characterization
The goal here is to assess the type of traffic patterns that the ns-3 emulator can handle in the testbed without introducing performance degradation issues that could bias the interpretation of evaluation results. Note that, as recently shown in [13], ns-3 emulation may experience performance degradation problems [...]

[...] the workload and the achieved goodput measured at the receiver for 6 UDP flows regardless of the WiFi link rate configuration. This experiment reveals that, due to both the frequent collisions in a single-channel mesh network and poor channel quality, the goodput attained at a fixed data rate of 54Mbps is lower than that attained at a fixed data rate of 36Mbps. In fact, the best goodput is obtained with all the backhaul links configured to a data rate of 36Mbps. Figure 6(b) confirms that the predominant WiFi link rate attempted by the SampleRate autorate algorithm is 36Mbps. Note that SampleRate aims to estimate the maximum link rate experiencing a high reception probability. We also see from Figure 6(a) that the SampleRate autorate algorithm experiences the highest variability, due to the autorate configuration of all WiFi cards.

[...] Second, we observe very short instants in which BP experiences a dramatic degradation of throughput. Figure 9(a) reveals that this corresponds to the time required for BP to decrease its V value so that the SC forming the link with the highest backpressure (i.e., differential of queue backlogs) is chosen as next hop by the source SC. Figure 9(b) reveals that the queue of the source SC never overflows, since the variable-V algorithm leverages the multiple available paths offered by the mesh backhaul testbed. The reconfiguration time of the V parameter is a function of the rate at which the data queue in the source SC fills up; in this experiment (one flow of 1Mbps), it takes around 1s to choose a longer path. In fact, this time would be substantially reduced by increasing the rate in the source SC.

Adaptability to the Wireless Backhaul Topology
Nonetheless, in [7], we have started to propose some modifications to the BP protocol to reduce the extra latency introduced by this reconfiguration time.
Third, BP periodically shows throughput peaks when the shortest path to the destination is available again. Such a throughput peak is generated by the aggregation of newly generated traffic (at a rate of 85 packets per second) and all the data packets accumulated in the data queue of the source SC. Once the queue is empty, the TNL GW serves 1Mbps. As shown in Figure 9(a), the V parameter in the source SC evolves by increasing its value, implying the use of geolocation information in routing decisions.
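As a quick consistency check of the figures quoted above, the packet size implied by a 1Mbps flow generated at 85 packets per second can be computed as follows. The sketch assumes that 1Mbps means 10^6 bits per second and that the rate refers to the same bytes carried by those 85 packets. The function name is hypothetical.

```python
def implied_packet_size_bytes(rate_bps: float, packets_per_second: float) -> float:
    """Average packet size, in bytes, implied by a bit rate and a packet rate."""
    return rate_bps / packets_per_second / 8


# 1 Mbps at 85 packets per second gives roughly 1470 bytes per packet.
size = implied_packet_size_bytes(1_000_000, 85)
```

Under these assumptions the flow carries about 1470 bytes per packet, which is in line with near-MTU-sized UDP datagrams.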

Interaction with TCP Traffic
The goal of these experiments is to illustrate through ns-3 simulations how BP scales and interacts with TCP traffic. In particular, we simulate larger SC deployment areas, thus complementing the previous results obtained in the testbed deployed in a confined indoor space. Simulation allowed us to show scalability not only with the size of the network but also with the number of backhaul radios per SC. In particular, we tested single-radio and multi-radio SCs, with the multi-radio backhaul comprising two backhaul radios and two orthogonal channels per SC.
By its very design, BP can be deployed without changing the TCP stack. The fundamental goal of TCP, which applies to all TCP variants, is to provide a reliable data delivery service by means of end-to-end ACK packets. Based on the received ACKs, TCP determines whether and how many data packets should be injected into the network by updating its window size.

Single TCP Flow Case
Our first experiment compares BP and SP with a single TCP flow in a 4x4 grid mesh backhaul. From Figure 11(a), we can observe significant latency gains of BP with respect to SP. The use of all possible paths increases the level of contention in the network in BP with respect to SP, which explains the lower throughput obtained when distributing traffic, achieving 2.5Mbps when injecting 7Mbps, 9Mbps, and 11Mbps. Note that contention would decrease with the use of multiple channels in the network, which is studied in the next section. While contention leads to throughput degradation in both protocols, SP also suffers from a dramatic increase in latency due to the increase of queue backlogs. As depicted in Figure 11(b), BP does not experience such a latency increase, since it balances the use of all possible paths to improve network resource utilization. An interesting observation is that, when introducing an additional non-overlapping WiFi radio for backhauling in every SC, both BP and SP exploit the resulting decrease of contention and the additional capacity; these configurations are labeled BP-MR and SP-MR in the multi-radio multi-channel experiments. However, gains are more significant with BP than with SP, since contention is the factor most relieved by the addition of an orthogonal channel in the backhaul.
In a second experiment, we consider a reference 2Mbps TCP traffic flow sent during 50s. We inject background traffic during 30s (from instant t = 15s to t = 45s) that overlaps with the fixed path taken by SP and a subset of the paths taken by the reference TCP flow with BP. Figure 11 [...]

Several TCP Flows Case
The third experiment compares the performance of BP and SP when injecting a variable number of 2Mbps TCP flows in a 7x7 grid mesh backhaul. We [...] Figures 12(a) and 12(b)) are tested with two radios and two channels per SC. Comparison results show that BP-MR exploits these added resources better than SP-MR. When injecting 6 TCP traffic flows, BP-MR shows substantial gains of up to 79% and 95% in terms of throughput and latency, respectively. Indeed, both SP and, to a lesser extent, SP-MR suffer from contention and higher queuing as TCP flows share their path towards the destination. In the multi-radio case, SP-MR is unable to exploit the added resources brought by multiple radios as BP-MR does. In particular, BP-MR showed gains of up to 114% and 81% compared to BP, while SP-MR showed gains of up to 4% and 44% compared to SP in terms of throughput and latency, respectively. These results indicate the convenience of BP over SP in both single-radio and multi-radio networks.

Conclusions
This paper has presented an extensive performance evaluation of BP both experimentally and by simulation. Our experimental methodology is based on ns-3 emulation, which allowed us to rapidly run our ns-3 implementation of BP practically unmodified in a WiFi-based mesh backhaul of small cells prototype. We tested BP under a wide variety of conditions in the testbed. With dynamic backhaul configurations, BP showed fast adaptability to topology conditions, leading to PDR improvements of up to 50% against an SP routing policy. We have demonstrated the scalability of BP with the size of the network and the number of backhaul radios through simulation, showing substantial benefits over an SP routing policy. In multi-radio multi-channel backhaul deployments, we observed BP improvements with respect to SP for TCP traffic of up to 79% and 95% in terms of throughput and latency, respectively.

Acknowledgements
This work has been funded by the Spanish Ministry of Science and Innovation under grant TEC2011-29700-C02-01. We would also like to thank the reviewers for their constructive and helpful comments.