Fronthaul-Aware Scheduling Strategies for Next Generation RANs

Next generation Radio Access Networks (RANs) consider virtualized architectures in which base station functions are distributed across different logical nodes, connected through fronthaul (FH) links. To reduce the FH deployment costs and the required FH capacity, operators may install a single FH link shared among multiple cells and exploit key enabling techniques, such as modulation compression, to decrease the data rates over the FH. In shared FH capacity scenarios, it is essential to provide efficient methods to control and optimize the utilization of FH resources with limited impact on the air interface performance. In this paper, we propose and analyze different fronthaul-aware scheduling strategies, leveraging modulation compression, for multi-cell multi-user scenarios with a FH link shared across multiple cells. We consider holistic approaches based on packet dropping at the PHY layer and postponing of MAC scheduling decisions, combined with the reduction of the modulation order per cell. Additionally, we propose optimization methods for resource allocation and dynamic modulation compression, in which the modulation order and resource block assignment are dynamically optimized per user and slot. Finally, we evaluate the proposed FH-aware scheduling methods over an end-to-end dynamic 5G NR system-level simulator based on ns-3.


I. INTRODUCTION
Next-generation RANs must address several challenges to provide high spectral efficiency, low power consumption, resource pooling, scalability, and cross-layer interworking. To that end, the 3GPP and Open-RAN (O-RAN) initiatives envision hybrid Centralized-RAN (C-RAN) architectures [1], in which the baseband processing of a single cell is disaggregated between different physical baseband entities, while parts of the baseband processing of multiple cells are collocated in centralized locations. In particular, a 3GPP and O-RAN 5G New Radio (NR) base station (i.e., gNB) consists of a Centralized Unit (CU), one or more Distributed Units (DUs) connected to the CU via a midhaul interface using the so-called Option 2 functional split (PDCP-RLC split) [2], and multiple Radio Units (RUs), each connected to one DU through a fronthaul (FH) interface using the Option 7.2x functional split (intra-PHY split) [3]. When deploying C-RAN architectures, the main obstacles are the tight FH capacity and latency requirements [4]. Such FH requirements are further accentuated in 5G NR by the use of wider channel bandwidths, massive antenna arrays, higher modulation orders, and a larger number of aggregated carriers [5], all of which increase the required FH capacity [6].
This work was partially funded by Spanish MINECO grant TEC2017-88373-R (5G-REFINE), Generalitat de Catalunya grant 2017 SGR 1195, and Huawei Technologies Sweden AB.
On the one hand, FH compression schemes can be used to reduce the required FH capacity [1]. Among the techniques envisioned by O-RAN [3], modulation compression is considered a promising option because it substantially reduces the required FH capacity without degrading the quality of the signal sent over the FH interface and without requiring complex algorithms or schemes [6]. In essence, modulation compression reduces the FH capacity by decreasing the IQ bitwidth sent over the FH interface according to the modulation order used over the air interface. On the other hand, to reduce deployment and operational costs, FH links shared across multiple cells are of high interest to mobile network operators [7]. Such scenarios are particularly challenging from a technical point of view, because the FH capacity is shared across multiple cells. Given a particular C-RAN architecture, with specific functional splits and a fixed FH topology, it is therefore essential to provide efficient methods for 5G NR to control and optimize the utilization of FH resources with limited (or no) impact on the air interface performance. This calls for system-wide designs and optimization solutions, whose evaluation needs to be based on dynamic, end-to-end, multi-cell simulations. In [8], semi-static modulation compression methods are proposed to optimize the maximum modulation order permitted per cell in the long term, based on cell statistics, considering a shared FH interface of limited capacity. In contrast, in this paper we focus on dynamic modulation compression methods, which adjust the scheduling and modulation compression decisions to the available FH capacity on a per-slot and per-user basis.
We present, analyze, and optimize different dynamic FH-aware scheduling methods, including holistic methods and optimization methods for resource allocation and modulation compression, in multi-cell scenarios with shared FH interface. Regarding the numerical evaluation, we assess the proposed strategies using a dynamic multi-cell 5G system-level simulator based on ns-3 [9].
The paper is organized as follows. Sec. II describes the system model assumptions, including the analytical characterization of the FH and air interfaces. Then, Sec. III presents baseline holistic FH-aware scheduling options and dynamic optimization solutions, considering a shared FH capacity constraint. Finally, we describe the simulation scenario and assess the dynamic modulation compression performance in Sec. IV. Sec. V provides some concluding remarks.

II. SYSTEM MODEL
The system model is composed of $N$ cells, whereby each cell serves $K_n$ users (UEs) ($n = 1, \ldots, N$), hence totalling $K = \sum_{n=1}^{N} K_n$ UEs in the system. We assume a hybrid C-RAN architecture with centralized, distributed, and radio units, implementing functional split Option 2 for the CU-DU split and intra-PHY functional split 7.2x for the DU-RU split, as specified by 3GPP [2] and O-RAN [3]. For the CU/DU/RU deployment, we consider the so-called Scenario B, prioritized by O-RAN [10], in which the CUs and DUs of all cells are collocated in a centralized entity (in an edge or regional cloud), while the RUs are located in proprietary cell sites. This way, the high-PHY, MAC, and above processing of all the cells is implemented together in the centralized entity, while the low-PHY and RF processing of each cell is placed in the RU of each site [11]. A low layer FH interface is used to interconnect RUs with DUs [3]. We assume a star FH topology so that the $N$ RUs share the same FH link, which is characterized by a limited capacity of $C$ bits/s. Fig. 1 shows the considered deployment.
We focus on the downlink and adopt the so-called modulation compression technique to reduce the required FH capacity in multi-cell scenarios with shared FH interface. Modulation compression allows lossless IQ data compression (see e.g. [3, §A.5]) whereby modulated data symbols can be represented by a limited number of IQ bits, i.e., the number of bits needed to represent the employed constellation in a data block [6]. In particular, we focus on dynamic FH-aware scheduling methods in which the modulation compression is dynamically optimized per cell and per UE, on a slot basis, considering the actual performances of the FH and air interfaces.
In the shared FH interface, the fronthaul throughput required across all the cells ($n = 1, \ldots, N$) using functional split 7.2x in a specific slot $t$, denoted $S^{\mathrm{FH}}_t$, is:
$$S^{\mathrm{FH}}_t = \frac{1}{T} \sum_{n=1}^{N^{\mathrm{a}}_t} \sum_{k=1}^{K^{\mathrm{a}}_{n,t}} \left( N_{\mathrm{sc}} N_{\mathrm{os}} N^{\mathrm{rb}}_{n,k,t} M_{n,k,t} + O_{\mathrm{mac}} + O_{\mathrm{mc}} \right) \tag{1}$$
where $N^{\mathrm{a}}_t \le N$ is the total number of cells that are active and share the FH interface in slot $t$, $K^{\mathrm{a}}_{n,t} \le K_n$ is the number of UEs actually served by the $n$-th cell in slot $t$, $M_{n,k,t}$ is the modulation order assigned to the $k$-th UE in the $n$-th cell at the $t$-th slot, $N^{\mathrm{rb}}_{n,k,t}$ is the number of Resource Blocks (RBs) allocated for data transmission towards the $k$-th UE in the $n$-th cell at the $t$-th slot, $N_{\mathrm{os}}$ is the number of OFDM symbols within a slot available for downlink data transmission (which is lower than 14 in 5G NR), $N_{\mathrm{sc}}$ is the number of subcarriers within an RB (which is equal to 12 in 5G NR), $T$ is the slot length (in seconds), $O_{\mathrm{mac}}$ is the MAC information overhead per UE (in bits) needed to implement intra-PHY splits (including RB assignment information, antenna configuration, beamforming vectors, etc.), and $O_{\mathrm{mc}}$ is the signaling overhead (in bits) that accounts for the notification through the FH towards each RU of the modulation order in use for each UE. In (1), a single MIMO layer per UE is assumed.
Fig. 1. Deployment scenario composed of $N$ cells, each serving $K_n$ UEs, with shared FH interfacing between distributed and radio units (DUs and RUs) with a star FH topology. CUs and DUs are placed together in a centralized entity, and each RU is located in a cell site.
For the air interface, we consider the per-user throughput as the metric of interest. The throughput for the $k$-th UE in the $n$-th cell at slot $t$, denoted $S^{\mathrm{AI}}_{n,k,t}$, is given by:
$$S^{\mathrm{AI}}_{n,k,t} = \frac{1}{T} N_{\mathrm{sc}} N_{\mathrm{os}} N^{\mathrm{rb}}_{n,k,t} M_{n,k,t} \tag{2}$$
From (1) and (2), it is evident that the scheduling decision (i.e., scheduling of UEs, resource allocation, and modulation order assignment) creates an air-FH trade-off that needs to be optimized appropriately. For example, increasing the modulation order of a specific UE in a slot increases the required FH throughput for that slot but also leads to a higher per-user throughput on the air interface. The trade-off is further accentuated in multi-cell scenarios, when multiple cells share the FH capacity (as reflected in Fig. 1) and potentially the air interface channel. In those multi-cell settings, scheduling decisions that are suboptimal from a system-wide perspective may lead to situations in which certain cells experience congestion (with low per-user throughputs) while other cells have high spectral efficiency and unused spectrum resources (with high per-user throughputs). Cells in the latter condition could reduce their assigned modulation orders, which would free some FH capacity for cells in more challenging conditions and thus yield a fairer quality of service across all the UEs in the network. With this in mind, efficient centralized and dynamic modulation compression scheduling methods for FH compression control should exploit the shared FH capacity while providing quality-of-service guarantees across all UEs.
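As a minimal sketch of the FH throughput bookkeeping described above, the following Python function sums, over all active cells and served UEs, the modulation-compressed payload bits plus the per-UE overheads, and divides by the slot length. It assumes that the modulation order is expressed in bits per symbol (e.g., 8 for 256QAM). The function name, the data layout, and the overhead value in the example are hypothetical; the defaults of 12 subcarriers, 12 data symbols, and a 0.5 ms slot follow the values quoted elsewhere in this paper.

```python
def fh_throughput_bps(allocations, overhead_bits, n_sc=12, n_os=12, slot_s=0.5e-3):
    """Required shared-FH throughput (bits/s) for one slot.

    allocations: one list per active cell; each entry is a
                 (n_rb, mod_bits) tuple for a UE served in the slot.
    overhead_bits: per-UE overhead O_mac + O_mc (illustrative value, in bits).
    """
    total_bits = 0
    for cell in allocations:
        for n_rb, mod_bits in cell:
            # IQ payload after modulation compression plus per-UE overhead
            total_bits += n_sc * n_os * n_rb * mod_bits + overhead_bits
    return total_bits / slot_s


# Two active cells, one UE each: 100 RBs at 256QAM and 50 RBs at 16QAM.
example = [[(100, 8)], [(50, 4)]]
print(fh_throughput_bps(example, overhead_bits=200) / 1e6, "Mbit/s")  # roughly 288.8
```

Comparing the returned value against the shared capacity C gives the feasibility check that the scheduling strategies in the next section rely on.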

III. FH-AWARE SCHEDULING STRATEGIES
Typically, $M_{n,k,t}$ (and hence the Modulation and Coding Scheme (MCS)) for the downlink is set based on the Channel Quality Indicator (CQI) reported by the UE. Based on $M_{n,k,t}$ and the RLC buffer level, the MAC scheduler determines the number of RBs that will be assigned to each UE ($N^{\mathrm{rb}}_{n,k,t}$). For that, the MAC scheduler follows certain pre-established scheduling rules (e.g., proportional fair or round robin, among others). Herein, to meet the shared FH capacity constraint, we start by introducing two baseline FH-aware scheduling methods, namely "dropping" and "postponing", which are easy to implement because they do not require modifications to the traditional MAC scheduler operation. Subsequently, to enhance these schemes, we propose centralized FH-aware scheduling methods that optimize the resource allocation (number of RBs, $N^{\mathrm{rb}}_{n,k,t}$) and modulation compression (modulation order, $M_{n,k,t}$) for all UEs and cells, assuming that all UEs with data in a particular slot have to be scheduled.

A. Dropping packets at PHY
The first baseline FH-aware scheduling option is to implement packet "dropping" at the PHY layer. That is, the modulation order ($M_{n,k,t}$) and the RB assignment ($N^{\mathrm{rb}}_{n,k,t}$) are determined per UE as usual, i.e., disregarding the FH capacity constraint, and the MAC Protocol Data Units (PDUs) are prepared and sent to the PHY layer as usual. Then, the high-PHY layer of each cell drops the MAC PDUs (including the related control and data) that cannot fit in the available FH capacity ($C$), considering the MAC PDUs across all the cells. This implies dropping the full MAC PDU(s). Notably, as the UE will receive neither control nor data, no Hybrid Automatic Repeat Request-Acknowledgment (HARQ-ACK) feedback will be generated and, therefore, once the time window for HARQ-ACK feedback reception at the gNB has elapsed, the full HARQ process related to such a drop will be erased. This baseline option acts as a centralized logic that takes the usual RB/MCS assignment of the MAC scheduler and determines which data (new data and/or HARQ retransmissions) is dropped. The dropping decision is implemented at the high-PHY layer in the DU.
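The dropping logic can be illustrated with a short Python sketch. The text does not specify the order in which MAC PDUs from different cells are examined, so the sketch simply assumes they are visited in the order given and admitted greedily until the per-slot FH budget is exhausted; the function name and the tuple layout are hypothetical.

```python
def drop_at_phy(pdus, capacity_bps, slot_s=0.5e-3):
    """Greedy admission of MAC PDUs against the per-slot FH budget.

    pdus: list of (cell_id, ue_id, fh_bits) in the (assumed) visiting order,
          where fh_bits is the FH load of the PDU, overheads included.
    Returns (sent, dropped); dropped PDUs are lost together with their control.
    """
    budget_bits = capacity_bps * slot_s
    used_bits = 0
    sent, dropped = [], []
    for pdu in pdus:
        fh_bits = pdu[2]
        if used_bits + fh_bits <= budget_bits:
            sent.append(pdu)
            used_bits += fh_bits
        else:
            # The whole PDU is discarded; the UE gets neither control nor data
            dropped.append(pdu)
    return sent, dropped


sent, dropped = drop_at_phy(
    [(0, 0, 300), (1, 0, 300), (2, 0, 100)], capacity_bps=1e6
)
print(sent, dropped)
```

With a budget of about 500 bits per slot in this toy example, the second PDU does not fit and is dropped, while the smaller third PDU still goes through.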

B. Postponing MAC scheduling decisions
The second baseline FH-aware scheduling option is to drop only the MAC scheduling decision, i.e., not dropping data but "postponing" its transmission or retransmission. In this way, the MAC layer discards the scheduling decision generated by the MAC scheduler, but the data remains in the RLC queues (new data) or the HARQ buffers (retransmission data), so no data is lost. In this case, we assume that the modulation order ($M_{n,k,t}$) and the RB assignment ($N^{\mathrm{rb}}_{n,k,t}$) are determined per UE as usual, disregarding the FH capacity constraint, and then the MAC scheduling decisions are discarded if the assignment does not fit in the available FH capacity limit ($C$), considering the allocations already scheduled across all the cells. This baseline option thus acts as a centralized scheduling logic that takes the usual RB/MCS assignment of the MAC scheduler and determines which scheduling decisions (of new data and/or HARQ retransmissions) are discarded, so that the corresponding transmissions are postponed. The postponing decision is implemented at the MAC layer in the DU.
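To contrast postponing with dropping, the sketch below keeps the pending data in a queue: decisions that do not fit in the per-slot FH budget are discarded, but the corresponding data stays buffered and is reconsidered in the following slot. The queue model, the visiting order, and all names are illustrative assumptions rather than the simulator's implementation.

```python
from collections import deque


def postpone_slot(pending, capacity_bps, slot_s=0.5e-3):
    """Serve one slot from a queue of pending allocations.

    pending: deque of (ue_id, fh_bits), oldest first (assumed order).
    Returns the allocations transmitted in this slot; allocations that do
    not fit remain in `pending` (their data stays in RLC/HARQ buffers).
    """
    budget_bits = capacity_bps * slot_s
    used_bits = 0
    transmitted = []
    kept = deque()
    while pending:
        item = pending.popleft()
        if used_bits + item[1] <= budget_bits:
            transmitted.append(item)
            used_bits += item[1]
        else:
            # Scheduling decision discarded; data is retried in a later slot
            kept.append(item)
    pending.extend(kept)
    return transmitted


queue = deque([("ue-a", 300), ("ue-b", 300), ("ue-c", 100)])
print(postpone_slot(queue, capacity_bps=1e6))  # first slot
print(postpone_slot(queue, capacity_bps=1e6))  # second slot serves the postponed UE
```

Unlike the dropping sketch, nothing is lost here: the price of exceeding the FH budget is additional delay for the postponed allocation.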

C. Resource allocation and modulation optimization
We now define an optimization problem whose objective is to dynamically determine the number of resources ($N^{\mathrm{rb}}_{n,k,t}$) and the modulation order ($M_{n,k,t}$) assigned to each active UE of every cell ($k = 1, \ldots, K^{\mathrm{a}}_{n,t}$, $n = 1, \ldots, N^{\mathrm{a}}_t$), in a given slot $t$, so that the air interface performance is optimized and the capacity of the shared FH is not exceeded (i.e., $S^{\mathrm{FH}}_t \le C$). Our objective is to serve all the UEs that have data in each slot and, therefore, a suitable optimization criterion is the maximization of the minimum per-user throughput, considering all the active UEs in the optimization. As the optimization is performed on a slot basis, the slot index $t$ is omitted hereafter.
If we focus on maximizing the minimum of the per-user throughput ($S^{\mathrm{AI}}_{n,k}$) among all active UEs across the cells, subject to a shared FH capacity constraint, the optimization problem for dynamic modulation compression and resource allocation can be stated as follows:
$$\max_{\{N^{\mathrm{rb}}_{n,k}\}, \{M_{n,k}\}} \; \min_{n,k} \; \frac{1}{T} N_{\mathrm{sc}} N_{\mathrm{os}} N^{\mathrm{rb}}_{n,k} M_{n,k} \tag{3}$$
subject to
$$\frac{1}{T} \sum_{n=1}^{N^{\mathrm{a}}} \sum_{k=1}^{K^{\mathrm{a}}_n} \left( N_{\mathrm{sc}} N_{\mathrm{os}} N^{\mathrm{rb}}_{n,k} M_{n,k} + O \right) \le C,$$
where $O = O_{\mathrm{mac}} + O_{\mathrm{mc}}$ is the total overhead for implementing the 7.2x intra-PHY split and dynamic modulation compression. Note that the objective of this work is to analyze solutions for configurations in which the shared FH interface is the element that may saturate and consequently constrains the system performance. We are not interested in situations in which the system's saturation is caused by the air interface, because MAC schedulers for this purpose have been widely investigated in the literature. Instead, we want to assess the impact that the FH interface has on the air interface performance. Because of this assumption and focus, we omit in (3) the constraint related to the number of RBs that each cell has available for transmission.
For mathematical tractability, we assume that the product of the modulation order and the number of assigned RBs (i.e., $x_{n,k} = N^{\mathrm{rb}}_{n,k} M_{n,k}$) is a continuous optimization variable. Hence, if we replace $N^{\mathrm{rb}}_{n,k} M_{n,k}$ by $x_{n,k}$ in (3), the optimization problem in (3) is a convex optimization problem on $\{x_{n,k}\}$, since it consists of maximizing a concave function subject to an affine (and hence convex) constraint [12]. As a result, it can be solved using standard convex optimization tools. However, the optimal solution can be found in closed form by reformulating the problem, as detailed next.
The optimization problem in (3) can be reformulated equivalently by including a new optimization variable ($\gamma$) that encapsulates the minimum of the per-user throughputs:
$$\begin{aligned} \max_{\{x_{n,k}\}, \gamma} \quad & \gamma \\ \text{subject to} \quad & \frac{1}{T} \sum_{n=1}^{N^{\mathrm{a}}} \sum_{k=1}^{K^{\mathrm{a}}_n} \left( N_{\mathrm{sc}} N_{\mathrm{os}} x_{n,k} + O \right) \le C, \\ & \frac{1}{T} N_{\mathrm{sc}} N_{\mathrm{os}} x_{n,k} \ge \gamma, \quad \forall n, \forall k. \end{aligned} \tag{4}$$
The optimization problem in (4) is a linear programming problem with respect to $\{x_{n,k}\}$ and $\gamma$, since the objective function and both constraints are linear in $\{x_{n,k}\}$ and $\gamma$ [12]. Interestingly, the optimal solution to (4) is such that the resulting per-user throughput of all users is the same and the shared FH capacity is fully distributed among the users, so that both constraints in (4) are met with equality. In other words, the optimal solution lies on the boundary of the constraint set, as we elaborate next. In (4), $\gamma$ is maximized when all the terms $\frac{1}{T} N_{\mathrm{sc}} N_{\mathrm{os}} x_{n,k}$ are equal, i.e., $\frac{1}{T} N_{\mathrm{sc}} N_{\mathrm{os}} x_{1,1} = \frac{1}{T} N_{\mathrm{sc}} N_{\mathrm{os}} x_{n,k}$, $\forall n, \forall k$. Note that, if this condition were not met, we could reassign FH capacity from the UEs with the largest terms to those with the smallest ones, which would increase $\gamma$ and hence the objective function. Therefore, the optimal solution to the optimization problem in (4) satisfies the second constraint with equality and has the following structure:
$$x_{n,k} = \frac{T \gamma}{N_{\mathrm{sc}} N_{\mathrm{os}}}, \quad \forall n, \forall k. \tag{5}$$
In addition, the optimal solution to (4) satisfies the FH capacity constraint (first constraint) with equality. Note that, if a solution did not meet the FH capacity constraint with equality, we could always increase all $\{x_{n,k}\}$, thereby increasing the objective function $\gamma$, so that solution would not be optimal. Therefore, by substituting the structure in (5) into the FH constraint in (4) with equality, we can isolate $\gamma$ as:
$$\gamma = \frac{T C - O \sum_{n=1}^{N^{\mathrm{a}}} K^{\mathrm{a}}_n}{T \sum_{n=1}^{N^{\mathrm{a}}} K^{\mathrm{a}}_n}. \tag{6}$$
Finally, by combining (5) and (6), the optimal solution to the optimization problem in (4) is:
$$x^{*}_{n,k} = \frac{T C - O \sum_{n=1}^{N^{\mathrm{a}}} K^{\mathrm{a}}_n}{N_{\mathrm{sc}} N_{\mathrm{os}} \sum_{n=1}^{N^{\mathrm{a}}} K^{\mathrm{a}}_n}, \quad \forall n, \forall k. \tag{7}$$
As the optimization problem in (4) is equivalent to the optimization problem in (3), the solution in (7) is the optimal solution to the original optimization problem in (3) as well. However, in (3) we had two sets of optimization variables (i.e., the modulation order and the RB assignment per UE). Therefore, multiple combinations of modulation order and RB assignment are optimal. Among them, in this paper we focus on two practical variants:
• fix the modulation order and optimize the RB assignment per UE based on (7), or
• fix the RB assignment and optimize the modulation order per UE based on (7).
In both cases, as shown in (7), the actual number of active users (with data in their RLC buffers), $\sum_{n=1}^{N^{\mathrm{a}}} K^{\mathrm{a}}_n$, is required, as well as the available FH capacity ($C$) and the overheads needed to implement the intra-PHY split and the dynamic modulation compression ($O = O_{\mathrm{mac}} + O_{\mathrm{mc}}$).
When optimizing the RB assignment per UE, for each UE we can select the modulation order ($M_{n,k}$) based on the CQI reported by the $k$-th UE in the $n$-th cell, and then select the number of RBs for that UE in such a way that (7) is met. That is,
$$N^{\mathrm{rb}}_{n,k} = \frac{x^{*}_{n,k}}{M_{n,k}}. \tag{8}$$
When optimizing the modulation compression per UE, we can perform the resource selection as usual, allocating RBs across the different UEs in a cell, and then select the modulation order per UE in such a way that (7) is met. That is,
$$M_{n,k} = \frac{x^{*}_{n,k}}{N^{\mathrm{rb}}_{n,k}}. \tag{9}$$
Note that, for the sake of tractability, the optimization problem in (3) has been solved by assuming that the optimization variables are continuous. Thus, the derived solutions need to be quantized into practical values. A simple quantization process is to take the floor over the possible RB numbers and modulation orders.
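The closed-form allocation derived above can be checked numerically with a few lines of Python. The sketch computes the common per-UE product of RBs and modulation order and the resulting common per-user throughput, and the example verifies that the FH constraint is then met with equality. The function name is hypothetical, and the 200-bit per-UE overhead is a placeholder chosen only for illustration; the 9 cells, 10 UEs per cell, 1 Gbps capacity, 12 data symbols, and 0.5 ms slot are the values quoted in the simulation section.

```python
def optimal_share(capacity_bps, active_ues_per_cell, overhead_bits,
                  n_sc=12, n_os=12, slot_s=0.5e-3):
    """Closed-form max-min solution: returns (x_star, gamma).

    x_star: common value of N_rb * M for every active UE (continuous).
    gamma:  resulting common per-user air throughput in bits/s.
    """
    total_ues = sum(active_ues_per_cell)
    if total_ues == 0:
        raise ValueError("no active UEs in this slot")
    payload_bits = slot_s * capacity_bps - total_ues * overhead_bits
    if payload_bits <= 0:
        raise ValueError("per-UE overhead alone exceeds the FH budget")
    x_star = payload_bits / (n_sc * n_os * total_ues)
    gamma = n_sc * n_os * x_star / slot_s
    return x_star, gamma


x_star, gamma = optimal_share(1e9, [10] * 9, overhead_bits=200)
print(round(x_star, 2), round(gamma / 1e6, 2))  # about 37.19 and 10.71 Mbit/s

# Sanity check: plugging x_star back in uses the whole FH capacity.
fh_bps = (12 * 12 * x_star * 90 + 90 * 200) / 0.5e-3
print(abs(fh_bps - 1e9) < 1.0)
```

Because every active UE receives the same share, only the total number of active UEs matters, not how they are distributed across cells.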
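The two variants and the floor-based quantization can be sketched as follows. This is an illustrative reading of the procedure, with some assumptions of our own: the set of valid modulation orders is taken as 2, 4, 6, and 8 bits per symbol (QPSK to 256QAM), RB counts may optionally be rounded down to whole RB groups, and `None` is returned when even the lowest modulation order would exceed the share. All names are hypothetical.

```python
import math

# Assumed downlink modulation orders in bits/symbol: QPSK, 16QAM, 64QAM, 256QAM
VALID_MOD_BITS = (2, 4, 6, 8)


def rbs_for_fixed_modulation(x_star, mod_bits, rbg_size=1):
    """Variant 1: modulation fixed (e.g., from CQI), floor the RB count."""
    n_rb = math.floor(x_star / mod_bits)
    # Optionally round down to a whole number of RB groups (assumption)
    return (n_rb // rbg_size) * rbg_size


def modulation_for_fixed_rbs(x_star, n_rb):
    """Variant 2: RBs fixed by the scheduler, floor onto a valid modulation."""
    target = x_star / n_rb
    candidates = [m for m in VALID_MOD_BITS if m <= target]
    return max(candidates) if candidates else None


x_star = 37.19  # illustrative per-UE share of N_rb * M
print(rbs_for_fixed_modulation(x_star, mod_bits=8))              # 4 RBs
print(rbs_for_fixed_modulation(x_star, mod_bits=4, rbg_size=4))  # 8 RBs
print(modulation_for_fixed_rbs(x_star, n_rb=8))                  # 4 (16QAM)
print(modulation_for_fixed_rbs(x_star, n_rb=20))                 # None
```

Flooring keeps the quantized allocation within the FH budget, at the cost of leaving a small part of the capacity unused.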

D. Modulation order limits
As analyzed in [6], [8], an efficient way to address the air-FH trade-off is to limit the maximum modulation order permitted per cell. This per-cell modulation order limit can still be used with the dynamic methods proposed in the previous sections (dropping, postponing, RB optimization, MCS optimization). In fact, it can be of particular interest for the dropping and postponing strategies, since lower modulation orders inherently limit the required FH throughput and may therefore reduce the number of dropped packets and postponed scheduling decisions.
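A per-cell limit of this kind amounts to clipping each UE's modulation order before the FH load is evaluated. The short sketch below, with hypothetical names and modulation orders expressed in bits per symbol, shows the clipping step and its effect on the FH payload of one cell.

```python
def apply_cell_limit(allocations, max_mod_bits):
    """Clip the modulation order of every (n_rb, mod_bits) allocation in a cell."""
    return [(n_rb, min(mod_bits, max_mod_bits)) for n_rb, mod_bits in allocations]


def payload_bits(allocations, n_sc=12, n_os=12):
    """FH payload of a cell in one slot, overheads excluded."""
    return sum(n_sc * n_os * n_rb * mod_bits for n_rb, mod_bits in allocations)


cell = [(100, 8), (50, 4)]            # 256QAM and 16QAM users
limited = apply_cell_limit(cell, 4)   # cap the cell at 16QAM
print(payload_bits(cell), payload_bits(limited))  # 144000 vs 86400 bits
```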

IV. SIMULATION RESULTS
For the performance assessment, we use the 5G-LENA module of ns-3 [9] with the 3GPP spatial channel model developed in [13], compliant with TR 38.901, and the NR-based PHY abstraction model proposed in [14].

A. Deployment scenario
We use a typical hexagonal site deployment. Each site has 3 cells (3 RUs), with 3 uniform planar antenna arrays, each covering 120° in azimuth. We deploy three sites, leading to a total of 9 cells ($N = 9$ RUs). We use an Urban Micro scenario, characterized by an inter-site distance of 200 m, an RU antenna height of 10 m, and 30 dBm transmit power. $K_n = 10$ UEs are randomly deployed per cell ($\forall n$). The deployment assumes a frequency reuse of 1, so that all cells transmit in the same frequency band. The operational band is in the 2 GHz region, with 100 MHz channel bandwidth per cell and a PRB overhead of 0.04 (typical of NR). Numerology 1 is used, i.e., 30 kHz subcarrier spacing. The duplexing mode is TDD, with a dynamic downlink-uplink slot structure. Therefore, the slot length is $T = 0.5$ ms, and the number of RBs and symbols available for downlink data within it are 266 RBs and $N_{\mathrm{os}} = 12$ symbols (2 OFDM symbols are left for PDCCH and PUCCH control channels), $\forall n, \forall k$. The antenna array configuration consists of 5 × 2 directional antenna elements at RUs and 1 isotropic antenna element at UEs. The base MAC scheduler uses the Round Robin scheduling metric to initially sort the UEs, and we configure 1 RB Group = 4 RBs.
Regarding the FH, we assume the scenario shown in Fig. 1. In this way, CUs/DUs are centralized, giving service to all the cells (RUs) of the deployment scenario under consideration, following functional split Option 7.2x (DU-RU). Due to the shared FH topology, the FH capacity is shared among the 9 RUs. The available FH capacity $C$ is varied across the simulations, taking the values $C = 1$ Gbps and $C = 0.5$ Gbps. For the required FH throughput, we use the expression in (1) with $O_{\mathrm{mac}} = 100$ Kbits and $O_{\mathrm{mc}} = 100$ bits.
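The slot length and RB count quoted above can be reproduced with simple arithmetic. The sketch assumes that the slot length is 1 ms divided by 2 to the power of the numerology, and that the usable RB count is the channel bandwidth, reduced by the PRB overhead, divided by the width of one RB (12 subcarriers); the latter is our assumption about how the figure of 266 RBs arises, and the function names are hypothetical.

```python
import math


def slot_length_s(numerology):
    """NR slot duration: 1 ms subframe split into 2**numerology slots."""
    return 1e-3 / (2 ** numerology)


def usable_rbs(bandwidth_hz, scs_hz, overhead, n_sc=12):
    """RBs that fit in the bandwidth left after the PRB overhead (assumed rule)."""
    return math.floor(bandwidth_hz * (1.0 - overhead) / (n_sc * scs_hz))


print(slot_length_s(1))               # 0.0005 s
print(usable_rbs(100e6, 30e3, 0.04))  # 266
```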

B. Traffic model and KPIs
We use File Transfer Protocol (FTP) Model 1, as recommended by 3GPP and described in [15, Sec. A.2.1.3.1]. This traffic model can be tuned to capture a range of realistic traffic types [16]. Among these, we have selected a widely used and predominant traffic type on the Internet, i.e., video traffic. YouTube video traffic is characterized by a fixed file size of 50 KBytes, with files generated according to a Poisson process of rate $\lambda = 50$ files/second, so that on average one file is generated every 20 ms [16]. The simulation duration is 10 seconds. We evaluate the performance of the FTP traffic model over the UDP transport protocol. As for the link layer, we consider UDP over RLC-UM.
The key end-to-end performance metrics considered in this study are "user-perceived throughput" (UPT) and "latency". Both are measured at the IP layer. The user-perceived throughput is computed as the ratio of the received bytes per file to the time needed to complete the file transfer. The latency is computed as the sum of the delays to receive all the packets that compose the file.
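For intuition about the offered load, the following sketch computes the mean application-layer rate per UE and draws file arrival instants from a Poisson process using exponential inter-arrival times. It assumes 1 KByte = 1000 bytes; the function names and the seed are arbitrary and serve only for illustration.

```python
import random

FILE_SIZE_BYTES = 50_000   # 50 KBytes, assuming 1 KByte = 1000 bytes
FILES_PER_SECOND = 50.0    # Poisson arrival rate (lambda)


def offered_load_bps(file_size_bytes=FILE_SIZE_BYTES, rate=FILES_PER_SECOND):
    """Mean application-layer load generated for one UE."""
    return 8 * file_size_bytes * rate


def poisson_arrivals(rate_per_s, duration_s, seed=0):
    """File generation instants with exponential inter-arrival times."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate_per_s)
        if t >= duration_s:
            return times
        times.append(t)


print(offered_load_bps() / 1e6, "Mbit/s per UE")   # 20.0
arrivals = poisson_arrivals(FILES_PER_SECOND, duration_s=10.0)
print(len(arrivals), "files in 10 s")              # around 500 on average
```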
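One possible way to compute both metrics for a single file is sketched below. The definitions leave some details open, so the sketch makes explicit assumptions: the transfer time is measured from the file generation instant to the reception of the last packet, and the latency is the literal sum of the per-packet delays. The function name and the packet tuple layout are hypothetical.

```python
def file_kpis(packets, file_start_s):
    """Per-file UPT (bits/s) and latency (s).

    packets: list of (tx_time_s, rx_time_s, size_bytes) for the packets of
             one file that were actually received.
    file_start_s: instant at which the file was generated (assumed reference).
    """
    if not packets:
        return 0.0, None  # nothing received: zero throughput, latency undefined
    rx_bytes = sum(size for _, _, size in packets)
    finish_s = max(rx for _, rx, _ in packets)
    upt_bps = 8 * rx_bytes / (finish_s - file_start_s)
    latency_s = sum(rx - tx for tx, rx, _ in packets)
    return upt_bps, latency_s


upt, latency = file_kpis([(0.000, 0.002, 1000), (0.001, 0.004, 1000)], 0.0)
print(upt / 1e6, "Mbit/s;", latency, "s")  # about 4.0 Mbit/s and 0.005 s
```

Under these assumptions, packets lost by the dropping strategy lower the received byte count and hence the UPT of the affected file.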

C. End-to-end results
In the end-to-end evaluation, for each available FH capacity ($C$), we assess the impact of using different FH-aware scheduling strategies for dynamic modulation compression. We compare the following methods:
• Drop: dropping packets at the PHY layer (Sec. III-A).
• Postpone: postponing MAC scheduling decisions (Sec. III-B).
• Rb: optimizing the RB ($N^{\mathrm{rb}}_{n,k}$) assignment per active UE, as per (8).
• Mcs: optimizing the modulation order ($M_{n,k}$) assignment per active UE, as per (9).
In all cases, we consider two subvariants of the methods: 1) no per-cell modulation order limit (i.e., each cell can use up to 256QAM) and 2) limiting the modulation order per cell to 16QAM, which are labeled in the figure legends as +256QAM and +16QAM, respectively. In the case of $C = 1$ Gbps, we observe that the Postpone, Mcs, and Rb strategies perform very similarly in terms of UPT and latency (see Fig. 2). Interestingly, they outperform the Drop strategy when no modulation order limit is imposed (i.e., +256QAM in the figures). Drop is the simplest strategy to implement but has the drawback that it drops and loses data packets, as shown by the tails of the UPT and latency CDFs. When the 16QAM modulation order limit is imposed, all the strategies perform similarly because, for $C = 1$ Gbps, the end-to-end performance is limited by the air interface (available RBs) and not by the FH.
These conclusions do not hold, though, with a FH of lower available capacity, such as $C = 0.5$ Gbps, shown in Fig. 3. In this case, we can see a clear advantage of the FH-aware scheduling strategies that optimize the RB and MCS allocation (Rb and Mcs) over the other strategies (Drop and Postpone) when no modulation order limit is imposed (i.e., +256QAM in the figures). In particular, the Rb and Mcs optimization strategies provide a significant gain over the Postpone strategy (see Fig. 3(a)) in all percentiles of the UPT and latency. Comparing the Rb and Mcs strategies, we observe that the Mcs strategy slightly outperforms the Rb strategy in the 5th percentile of the UPT, because reducing the modulation order makes the transmission more robust to channel impairments and interference (the probability of error is reduced). On the other hand, the Rb strategy outperforms the Mcs strategy in the 50th and 95th percentiles of the UPT (see Fig. 3(a)), because it distributes the available air interface RBs more effectively, which allows a larger quantity of data to be served. Interestingly, when a modulation order limit is imposed (+16QAM in the figures), the UPT values are upper bounded, but the 5th and 50th percentiles of the UPT can be improved compared to their counterparts with no MCS limit, because the reduction of the modulation order improves the transmission reliability.

V. CONCLUSIONS
This paper has proposed dynamic holistic and optimized FH-aware scheduling strategies for hybrid C-RAN architectures with multiple cells sharing a single FH link, assuming that modulation compression is used to reduce the amount of data that is sent over the FH interface. The proposed optimized methods allow controlling the FH resources' utilization without exceeding the available FH capacity while optimizing the per-user experienced throughput of all the users in the system. We have then evaluated the proposed FH-aware scheduling methods in an end-to-end, dynamic, multi-cell simulation built on the ns-3 5G NR module. Simulation results in a multicell and multi-user scenario have shown that the proposed optimized solutions outperform the baseline strategies, providing user-perceived throughput gains in conditions of a tight FH capacity. In situations with a more relaxed FH capacity, simulations show that the baseline holistic method that follows a postponing approach is already sufficient from a system-level performance perspective.