Airtime-Based Resource Allocation Modeling for Network Slicing in IEEE 802.11 RANs

In this letter, we propose an airtime-based Resource Allocation (RA) model for network slicing in IEEE 802.11 Radio Access Networks (RANs). We formulate this problem as a Quadratically Constrained Quadratic Program (QCQP), where the overall queueing delay of the system is minimized while strict Ultra-Reliable Low Latency Communication (URLLC) constraints are respected. We evaluated our model using three different solvers where the optimal and feasible sets of airtime configurations were computed. We also validated our model with experimentation in real hardware. Our results show that the solution time for computing optimal and feasible configurations vary according to the slice’s demand distribution and the number of slices to be allocated. Our findings support the need for precise RA over IEEE 802.11 RANs and present the limitations of performing such optimizations at runtime.


I. INTRODUCTION
T HE fifth generation of mobile networks (5G) aims to enable applications to run with lower latency, more reliability, massive connectivity, and improved energy efficiency. The ITU Radiocommunication Sector (ITU-R) has defined three main uses for 5G: Enhanced Mobile Broadband (eMBB), Ultra-Reliable Low Latency Communication (URLLC), and Massive Machine Type Communications (mMTC); these are envisioned to be deployed in the coming years. Among the stringent requirements, low latency is seen as crucial and URLLC as the key enabler in this new age of connectivity.
Network slices, besides operating independently from one another, provide networking resource and traffic isolation among users and services [1]. In the literature, many proposals focus on network slicing for IEEE 802.11 networks [2]. Manuscript  Network slicing is being used to address the Resource Allocation (RA) problem in IEEE 802.11 Radio Access Networks (RANs), providing the required bandwidth and allocating resources accordingly [3]- [5]. However, in addition to the required bandwidth, End-to-End (E2E) latency and reliability should be considered as well. Moreover, deciding how to efficiently manage slice resources is still an open issue. Nowadays, advanced 5G services and applications have stringent Quality of Service (QoS) requirements, which make the RA problem even more complex. To cope with dynamic and unstable wireless environments and to meet QoS requirements, network slices might be instantiated, modified, and terminated at runtime. Therefore, such slices have to be properly and dynamically managed. To the best of our knowledge, this is the first work to exploit the flexibility of slice airtime allocation considering the IEEE 802.11 RAN resource availability and the stringent latency Key Performance Indicator (KPI) related to URLLC. We use a scheduling policy able to assign different airtime configurations to each slice, named Airtime Deficit Weighted Round Robin (ADWRR) [5]. Thus, each slice can be configured according to their priority.
In this letter, we focus on an RA model that optimizes the overall queueing delay of the system while QoS constraints are respected. Given that flows are distributed among slices that, in this case, represent abstractions of queues, we model this problem using concepts of Queuing Theory. For that, we propose a Quadratically Constrained Quadratic Program (QCQP). We evaluated our model using Advanced Process OPTimizer (APOPT), Interior Point OPTimizer (IPOPT), and Z3 [6] solvers. We then validated our model with experimentation in real hardware. Our results show that runtime optimization is limited by the demand distribution and the number of slices to be allocated. Our findings support the need for precise RA over IEEE 802.11 RANs and present the limitations of performing such optimizations at runtime.

II. IEEE RAN-RA PROBLEM FORMULATION
The IEEE 802.11 RAN consists of a set Access Points (APs) responsible to deliver data from different services to several users in the network, i.e., Stations (STAs). Each AP has resources to be shared and therefore has to be properly managed. Multiple tenants (i.e., virtual operators or service providers) share from the same infrastructure and have their specific Service Level Agreements (SLAs). These SLAs are translated into QoS requirements that the network has to support (e.g., minimum throughput, maximum allowed E2E latency, and acceptable packet loss ratio). To meet such requirements, we propose the use of network slicing. According to Richart et al. [3], there are two variants of network slicing. The first abstracts the different services and 1558-2558 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information. ensures QoS within them, which is referred to as Quality of Service Slicing (QoSS). The second defines slices as the traditional idea of network virtualization, where a precise subset of network resources is allocated to each tenant and full control is provided, which is called Infrastructure Sharing Slicing (ISS). As we focus on QoS within a slice as being a service, we use the QoSS variant.

A. Problem Statement & Assumptions
To represent the minimum chunk of wireless resources that can be assigned to a user, we use the Resource Block abstraction from Riggio et al. [7]. This abstraction defines a WiFi interface at a given AP, identified by the network interface identifier (e.g., Medium Access Control (MAC) address), operating channel (e.g., 1, 6, 11), and the type of channel (e.g., High Throughput (HT) 20MHz, Very High Throughput (VHT) 40MHz). Therefore, each resource block has its transmission capabilities, e.g., maximum dequeuing rate μ b MAX . For each resource block, frames are first classified into the different queues as slices, based on the definition of traffic rules (e.g., OpenFlow rules). Figure 1 depicts a simplified queue structure along with the data traffic flow within a single AP. Thereafter, frames belonging to such slices/queues are dequeued following the ADWRR scheduling scheme [5]. With such a scheduling scheme, different portions of airtime, a.k.a., quantums, are allocated to each slice s in each transmission round, denoted Q s . In this manner, larger values for Q s can be assigned to slices supporting services with stricter performance requirements and, hence, more radio resources are allocated.
According to the dequeuing process of the ADWRR scheduling policy, the hypervisor only serves traffic rules whose expected transmission time, estimated by a rate control algorithm (e.g., Minstrel ), is smaller than a deficit counter. In each transmission round of a given slice s, a quantum Q s is added to this deficit counter. If this counter is greater than the estimated airtime consumed to transmit the current frame, transmission occurs and the counter is decremented by the used airtime. Otherwise, if the transmission time of a given frame exceeds the deficit counter, this is held back until the next round of the scheduler. Besides, each traffic rule contains multiple aggregation buffers, one for each user in the slice, which can be scheduled as intended. Nevertheless, inactive traffic rules do not cause any performance degradation.
As stated in [5], the expected transmission airtime A for a packet L bits long can be approximated by: In Equation 1, R BEST is the Modulation and Coding Scheme (MCS) with the highest throughput and delivery probability of transmitting a frame and receiving the WiFi Acknowledgment (ACK) using the MCS R BEST . Note that Shortest Interframe Spacing (SIFS), DCF Interframe Spacing (DIFS), and the time for receiving the Acknowledgement T ACK are considered. Given that network flows are distributed among slices and slices represent abstractions of queues, we model this problem using Queuing Theory. Besides, since such flows' arrivals are independent of one another and the traffic pattern is random, we consider that flows' arrivals follow a Poisson distribution. Nonetheless, we acknowledge the inability to capture traffic burstiness that characterizes some data traffic patterns. However, our focus is to find the optimal assignment of Q s and validate it through real hardware experimentation.

B. Mathematical Model
The described network slicing RA problem can be approached through optimization techniques. In this section, we present a QCQP approach for constraints. Since slices represent queues and flow's arrivals follow a Poisson distribution with exponential services, in Kendall's notation, our model can be represented as sets of M/M/1. Given n services to be delivered by a resource block b, n slices are instantiated. For each slice s in the set of slices S b of a resource block b, the maximum dequeuing rate μ b MAX , the dequeuing rate λ s , the frame size FRM s size , and the airtime A s used for its transmissions are provided. With such an input, the average queueing delay W s of each slice can be calculated and the maximum average queueing delay W s QoS can be ensured. Besides respecting the service delivery rate and queueing delay constraints, we aim to find the optimal assignment of Q s that minimizes the overall queueing delay. Therefore, subject to the following set of constraints: 1) Queueing Delay: The average queueing delay W s of each slice s in resource block b is calculated as follows: where L s is the average number of frames in slice s. Notwithstanding, queueing delay constraints have to be ensured, hence, Nonetheless, the average number of frames L s is given by: where ρ s defines the service utilization of a given slice.
2) Service Utilization: The service utilization ρ s of a slice s in a resource block b is given by: subject to the actual dequeuing rate μ s , assigned to the slice s.
To prevent more arrivals than the server is capable of handling, we determine that the ρ has to respect the following constraint: ensuring that frames are dequeued at a faster pace than enqueued, thus preventing queues to grow indefinitely.

3) Service Delivery Rate:
According to the ADWRR algorithm, the rate in which frames are dequeued in a slice is given by its available airtime by the airtime used for its transmissions. This also has to consider all other competing slices within the same resource block. Therefore, the frame dequeuing rate μ s FRM is calculated as follows: Subsequently, the allocated dequeuing rate μ s is given by: It is important to emphasize that the computation of μ b MAX is not within the scope of this letter. Since μ b MAX depends on how lower MAC and physical layers would behave according to network conditions, we assume that any Software-Defined Networking (SDN)-based admission control system can be introduced to estimate such capacity. We focus on the assignment of airtime portions within the upper MAC layer, thus, we have considered μ b MAX as an input of our problem. Besides, μ s represents the dequeuing rate reserved for slice s, ensured by the hypervisor. Thus, in cases where the channel is saturated and the transmission limit of the resource block is reached, each of the slices receives its respective μ s . On the other hand, slices share the remaining resources equally. 4) Resource Allocation: Slices in a resource block can only allocate a limited amount of resources, i.e., μ s . Intuitively, the sum of such allocated resources within a resource block cannot exceed its maximum capabilities. However, to be able to optimize the overall queueing delay of the system, all the available resources of the resource block should be allocated: Besides, slices must provide the required dequeuing rate λ s .
Nevertheless, each resource block b must have enough resources to support its demands. Thus, Otherwise, there is no feasible solution. With such a model, our goal is to find values for the Q s decision variables that dictate the portion of airtime allocated to each slice on each resource block. In this manner, besides satisfying all performance constraints in terms of dequeuing rate and queueing delay, the remaining resources are optimally shared. Hence, reducing the overall queueing delay of the entire system.

A. Methodology and Workload
The parameters for our evaluation are defined in Table I. We set equal values for the airtime A s at each round of the ADWRR scheduling algorithm. Likewise, to ensure that the problem is always feasible, we set all W s QoS to 1 sec. To solve our model, we first identify if the problem is feasible using the Z3 Satisfiability Modulo Theories (SMT) solver [6]. Then, we check how fast the optimal solution can be found with APOPT and IPOPT solvers, using GEKKO [8].
As a benchmark, we used a laptop equipped with an Intel Core i7 (2,2 GHz) processor and 8 GB 1600 MHz DDR3. We opted to use APOPT and IPOPT because they are based on different approaches, which have fundamental differences in handling constraints. IPOPT is based on the Interior Point Method (IPM) in which has proven to scale very well to large Nonlinear Programming (NLP) problems with a small number of constraints. On the other hand, APOPT is based on active-set Sequential Quadratic Programming (SQP) in which scale very well to large NLP problems with a high number of constraints. Nonetheless, Z3 is an SMT solver which is used in several program analysis, verification, and test case generation projects. With Z3, is assured that a given problem is feasible and it has been proven to be fast in satisfiable instances.
To validate our model, we run an experiment in a single resource block using the parameters defined in Table I. A computer connected to the wired segment generates different flows towards a single STA, each passing through a different slice. To estimate μ b MAX , we generate bursts with frame size equal to 1024bytes until the maximum transmission capability is reached. With the μ b MAX , we draw the expected average queueing delay according to our model. Thereafter, we generate different flows following the Poisson distribution and we introduce up to 6 slices configured with equal Q s . Alongside this, we measured the queueing delay during the whole experiment run. We also verified when ≈50% of the resources remain to be allocated, avoiding channel saturation/conditions to affect the results. For this experimentation, we used an AP based on the PC Engines APU2D4 (x64) processing board, equipped with one Qualcomm Atheros AR958x 802.11 a/b/g/n and a Raspberry Pi 4 B with an 802.11b/g/n/ac for the STA. Figure 2 illustrates the objective function and the solution time for computing feasible and optimal solutions according to the number of slices. The results reported are the average of 10 runs. We can observe in Figure 2a that Z3 has reasonable performance in terms of execution time, especially for the problems where the number of slices is less than 7. However, nonlinear arithmetic is undecidable. Therefore, its semi-algorithm does not guarantee, for every input, a yes-orno-answer within a reasonable amount of time. Hence, for the larger instances of the problem where a solution is not found within 10min, we cancel the experiments. On the other hand, Figure 2b shows that both APOPT and IPOPT solve the problem for all sets of slices with the same objective. Moreover, as expected, both objective and solution time increase along with the number of slices, i.e., queues. In most cases, APOPT solves the problem faster and with the same optimal solution. Figure 2c presents the dequeuing rate and the achieved throughput while Figure 2d presents the number of active slices, measured queueing delay, and expected s∈S b W s for a given μ b MAX . In the first 70 seconds, we measure μ b MAX with 1720p/s while the burst of frames was introduced. Results shown from time 70 to 200 and from time 200 to the end are relative to μ b GAP of ≈50% and 10%, respectively. As we can see in Figure 2c, the measured queueing delay follows the expected pattern, increasing with the number of slices. However, when resources available are near the limit (e.g, ≈10%), due to channel conditions, devices' processing capabilities, or lower MAC and physical layer behavior, the measured queueing delay increases above the expected values. Therefore, it remains a challenge to estimate μ b MAX according to the current channel conditions and network configuration.

IV. CONCLUSION
In this letter, we have studied the optimal allocation of network slices in IEEE 802.11 RANs. We have proposed an airtime-based RA model where slices are configured to support strict QoS requirements such as the URLLC. To compute optimal and feasible solutions, we conducted experiments using APOPT, IPOPT, and Z3 solvers. Besides, we validated our model with experimentation in real hardware. Our implementation is publicly available on GitHub. 1 Our results show that the solution time for computing the optimal airtime configuration varies according to the number of slices to be allocated and the strictness of the QoS constraints. Our findings support the need for precise RA over IEEE 802.11 RANs and present the limitations of performing such optimizations at runtime. As future work, we plan to approach such a problem with fast control algorithms.