User-aware centralized resource allocation in heterogeneous networks

In the last two years, 5G networks and services have proliferated in Europe. The integration of 5G networks with other radio access networks is considered one of the key enablers for meeting the challenging 5G Quality of Service requirements. In particular, the integration with high throughput satellites promises to increase network performance in terms of resilience and Quality of Service. The present work addresses this problem and presents a user-aware resource allocation methodology for heterogeneous networks. The methodology is articulated in two steps: first, the Analytical Hierarchy Process is used to decide the network over which traffic is steered; then, a Cooperative Game is set up to allocate resources within the network. Simulations are presented to validate the proposed approach.

heterogeneous access network) is one of the key enablers for the 5G requirements. To empower this kind of network, it is necessary to simultaneously connect and orchestrate different ICT resources belonging to different RATs. By doing so, it is possible to send end-users' traffic on the most suitable RATs, i.e. on the networks which best match the user requirements. To solve this problem, it is necessary to configure the User Equipment (UE) so that the resources provided by different nodes, characterized by different access technologies, can be exploited. This latter aspect is referred to as the multi-connectivity problem.
The goal of the present work is to present an approach to solve the multi-connectivity problem considering the user requirements and preferences as well as the status of the different RATs. For the underlying functional architecture, the reader can refer to [4], [5].
The remainder of this paper is organized as follows: Section II presents a review of the literature concerning the multi-connectivity problem; Section III describes the proposed approach whose mathematical formulation is presented in Section IV; in Section V simulations are discussed and, finally, in Section VI, results are wrapped up and future developments discussed.

II. STATE OF THE ART
The multi-connectivity problem can be addressed from the architectural or the algorithmic point of view. Although the focus of this paper is on the second aspect, a concise review of the architectural issues and solutions discussed in the literature is presented in the following, for the sake of completeness. For a complete overview of multi-connectivity architectures, the reader can refer to [6].
From an architectural point of view, multi-connectivity requires addressing two main problems: (i) multi-RAT integration, i.e. choosing the layer at which the integration is performed, and (ii) multi-connectivity management, i.e. the decision relative to the coordination and cooperation between RATs. Concerning the first problem, the integration can be performed at three different layers [7]:
• At the application layer. This solution is easier to implement but it is application-dependent and, for highly dynamic scenarios, may lead to sub-optimal solutions since it does not consider the network status;
• At the core-network layer. This solution is based on interworking between core networks: the RAT selection is made considering the operators' policy for network selection, but the overall network selection decision remains in control of the UE (which has only local knowledge);
• At the RAN layer. This solution envisages a coordination between the RATs using dedicated interfaces, thus enabling more dynamic Radio Resource Management (RRM) mechanisms and improving overall system and user performance (feedback from UEs and operators can also be considered).
Concerning the multi-connectivity management, three main approaches can be adopted [8]:
• User-centric: the measurements (e.g. the signal to noise ratio) performed at the UE level drive the handover to a different RAT;
• RAN-assisted: the handover decision is taken based on measurements at the UE level and on information about the status of the RATs;
• RAN-controlled: UEs provide measurements of their local radio environment and, based on these and on the knowledge of the network status, the RAN takes the decision.
Due to the mentioned characteristics, RAN-based integration and RAN-controlled handover decisions have been adopted in [9], [10], [11]. For such a solution, the main aspect to address is the functional split among the RAN components. In other words, the problem consists in deciding whether the RAN functionalities should be placed in a central unit or in the distributed units. This choice drives the definition of the multi-connectivity algorithms. In [10] the main RAN deployments (i.e., Non-Centralized, Co-Site, Centralized and Shared-RAN) have been discussed.
The multi-connectivity problem has been modelled as a RAT selection problem [11], [12], [13], [14]. The problem consists in selecting the most appropriate access network, with characteristics able to satisfy the 5G requirements, over which to send the traffic (traffic steering problem). Several approaches have been proposed to solve the problems of traffic steering and network selection, ranging from Multiple Attribute Decision Making (MADM) [15], [16] and fuzzy logic [17] to game theory, Markov Decision Processes (MDPs) [18] and Reinforcement Learning [19], also applied to other network resource allocation problems in the context of SDN and VNF [20], [21], [22].
5G AgiLe and fLexible integration of SaTellite And cellulaR). This work reflects only the author's view. The EU Commission, the Research Executive Agency and the IITP are not responsible for any use that may be made of the information it contains.
Concerning game-theoretic approaches, the RAT selection problem is usually modelled as an adversarial game between users [11], [23], [24] or between network operators [25], [26]. These approaches pay the so-called price of anarchy, i.e. the absence of collaboration leads to a loss of network performance.

CONNECTIVITY
In this work, the RAN-Controlled approach and RAN-Based Integration are considered. RAN-Based Integration allows cooperation and coordination among RATs, while the RAN-Controlled approach avoids the suboptimal solutions typical of User-Centric approaches. To support these approaches, the considered architecture consists of a Central Unit (CU), containing the Control Plane cooperation and coordination functionalities together with the User Plane integration and switching functionalities. Such an architecture is referred to as Centralized Deployment [10] and is depicted in Fig. 1 (where gNB stands for Next Generation Base station).
Concerning the network selection problem, the proposed approach is user-aware: it envisages a personalization system whose functionalities are placed in the Core Network (CN). This system consists of a repository of the historical connection data of each user and a processing block that, based on the stored information, is able to synthesize, for each user (or cluster, i.e. homogeneous group of users), a set of user characteristics in terms of the user's Connection Preferences. These Connection Preferences specify the user's needs not in terms of additional QoS constraints, but in terms of personal preferences expressed over a set of non-standardized parameters such as battery consumption, connection costs, mobility, etc. The stored information is updated based on the user's feedback at the end of each connection and is used for resource allocation.
The multi-connectivity problem, when considering the user requirements and network status, requires solving two distinct (but related) problems:
• Network selection process, aimed at assigning users, or a portion of their connection traffic, to one or more access networks. The selection process is typically performed statically, and the users-networks association is maintained during the connection life cycle;
• Bandwidth allocation process, aimed at allocating the available resources while matching users' requirements. This problem is solved dynamically, in a centralized or distributed, cooperative or non-cooperative fashion, trying to satisfy the users' resource requests with the resources currently available in the networks and optimizing the network performance in a load-balancing fashion [27]-[30].
To solve these two problems, the proposed methodology combines the Analytical Hierarchy Process (AHP) and a Cooperative LQ Difference Game with the following roles:
• AHP solves the network selection process by evaluating the affinity of each network with respect to the users' connections [31], considering the different network characteristics and the actual users' QoS;
• Cooperative Difference Game solves the networks' resource allocation problem by considering the networks as cooperative agents which optimally distribute their resources among users, taking into account the output of the network selection process.
The resulting approach is thus a model-driven, dynamic algorithm with a prioritization of objectives and performance based on the users' preferences and the networks' characteristics.
The two stages are not solved simultaneously. The AHP stage involves non-real-time data (user and network information), typically stored in the CN by the network operators and updated over a time scale longer than a connection duration. The difference game stage, on the other hand, involves real-time data, typically stored in the RAN and updated in real time according to the connection requirements and the network conditions. In what follows, the mentioned methodologies are briefly described.

A. AHP (Analytic Hierarchy Process)
The AHP is a methodology used to perform multi-criteria decisions in complex environments [33]. This is achieved by means of a hierarchical structure consisting of three layers: (i) the target objective, (ii) several evaluation criteria and (iii) several possible options. In more detail, the AHP can be summarized in the following steps:
i. Definition of the decision problem. The hierarchical structure is defined, i.e. the target objective, the evaluation criteria (or attributes) and the available options are identified.
ii. Computation of the vector of attribute scores $a$. At first, the $n$ evaluation criteria are pairwise compared by domain experts using Saaty's scale (Table 1). The results of such comparison are then stored in an $n \times n$ matrix $C$. The elements $a_i$ of the attribute scores vector $a$, where $i$ denotes the $i$-th criterion, can be computed as:
$$ a_i = \frac{1}{n} \sum_{j=1}^{n} \bar{c}_{ij} \qquad (1) $$
where $\bar{c}_{ij}$ is the normalized value of the $ij$-th entry of matrix $C$ (i.e., $c_{ij}$ divided by the sum of the entries of column $j$). The column vector $a$ contains the information about the importance of each attribute in the decision process.
iii. Consistency check. In [33] a procedure has been proposed to avoid inconsistencies (or judgement errors) in the comparison matrix $C$. In particular, the matrix $C$ is said to be consistent if and only if its largest eigenvalue $\lambda_{max}$ is equal to $n$. To evaluate the degree of inconsistency of the comparison matrix, it is possible to define the Consistency Index (CI) as
$$ CI = \frac{\lambda_{max} - n}{n - 1} \qquad (2) $$
By considering this index together with the Random Consistency Index (RI), computed as the average of the CIs associated with randomly generated comparison matrices, it is possible to define the Consistency Ratio (CR):
$$ CR = \frac{CI}{RI} \qquad (3) $$
If the CR is smaller than or equal to 0.1 the inconsistency is acceptable, otherwise the subjective judgement needs to be reconsidered.
iv. Computation of the options' scores vector $s^{(i)}$. For each criterion $i$, the option score matrix $P^{(i)}$ is an $m \times m$ matrix, with $m$ the number of options, whose entries $p^{(i)}_{jk}$ are computed by comparing (e.g., according to Table 1) option $j$ and option $k$ with respect to the $i$-th criterion. The consistency of the matrices $P^{(i)}$ can be checked with the procedure described in step iii. The score of the $j$-th option with respect to the $i$-th criterion is computed as
$$ s^{(i)}_j = \frac{1}{m} \sum_{k=1}^{m} \bar{p}^{(i)}_{jk} \qquad (4) $$
where $\bar{p}^{(i)}_{jk}$ is the normalized value of the $jk$-th entry of $P^{(i)}$.
Table 1. Saaty's scale [...] The evidence favors one element with the highest possible order of affirmation.
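Steps ii and iii can be illustrated with a short, dependency-free Python sketch. It is not the authors' implementation: the comparison matrix is hypothetical, the largest eigenvalue is approximated with the usual row-ratio shortcut instead of a full eigenvalue computation, and the Random Consistency Index values are the ones commonly tabulated for Saaty's method.

```python
# Random Consistency Index per matrix size, as commonly tabulated for
# Saaty's method (assumption of this sketch, not taken from the paper).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_scores(matrix):
    """Score vector of a pairwise comparison matrix.

    Each column is normalized to sum to 1, then each row is averaged.
    """
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
            for i in range(n)]


def consistency(matrix):
    """Return (lambda_max, CI, CR) for a pairwise comparison matrix.

    lambda_max is approximated as the mean of (C a)_i / a_i.
    """
    n = len(matrix)
    a = ahp_scores(matrix)
    ca = [sum(matrix[i][j] * a[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(ca[i] / a[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return lambda_max, ci, cr


# Hypothetical comparison of three criteria (e.g. cost, delay, security).
criteria = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights = ahp_scores(criteria)
lambda_max, ci, cr = consistency(criteria)
print("attribute scores:", [round(x, 3) for x in weights])
print("consistency acceptable:", cr <= 0.1)
```

The same `ahp_scores` helper applies unchanged to the option score matrices of step iv, since they are normalized and averaged in the same way.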

D. Linear Quadratic Difference Game
Differential (or difference) games [34] are a class of dynamic games that generalize optimal control theory. The main difference between differential games and optimal control is the presence of several independent control actions that drive the state evolution. These inputs are managed by different players and, in general, there is not a single objective function, as each player has its own. Differential games are characterized by:
• a set of differential equations, to model the state evolution of the system of interest;
• the state and control constraints, to model physical and performance constraints;
• the objective functions, to model the target performance.
Differential games can be cooperative, where all the players have the same objective function, or non-cooperative, where each player has its own objective function. In this work the former class of games is considered. That is, players coordinate their strategies in view of optimizing a collective objective function: selfish behaviours are rewarded less than collaborative ones.
The problem of interest can be modelled as a Discrete-Time Linear Quadratic (LQ) differential game [35] whose general formulation for two players is given by (5)-(11), where $T$ is the final time, $x(k)$ is the state of the dynamical system to be controlled (with $x_0$ and $x_T$ its initial and final values), $u_1$ and $u_2$ are the vectors of variables which can be manipulated by the two players, $d$ models disturbances and $Q$, $R_1$ and $R_2$ are weight matrices. The matrices $A$, $B_1$ and $B_2$ are the system dynamic matrix and the input matrices of the two players, respectively. The values $\bar{x}$, $\bar{u}_1$ and $\bar{u}_2$ are the upper bounds of the variables $x$, $u_1$ and $u_2$, respectively.
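For orientation, a generic two-player discrete-time LQ game with the ingredients listed above is often written as in the following sketch. The symbol names, the summation limits, the zero lower bounds and the additive form of the disturbance are assumptions made here for illustration; the display is not claimed to reproduce the authors' equations (5)-(11) term by term.

```latex
\begin{aligned}
\min_{u_1,\,u_2}\quad & J = \sum_{k=0}^{T-1} \Big[ x(k)^{\top} Q\, x(k)
      + u_1(k)^{\top} R_1\, u_1(k) + u_2(k)^{\top} R_2\, u_2(k) \Big] \\
\text{s.t.}\quad & x(k+1) = A\, x(k) + B_1 u_1(k) + B_2 u_2(k) + d(k), \\
& x(0) = x_0, \qquad x(T) = x_T, \\
& 0 \le x(k) \le \bar{x}, \\
& 0 \le u_1(k) \le \bar{u}_1, \\
& 0 \le u_2(k) \le \bar{u}_2 .
\end{aligned}
```

In the cooperative case both players minimize the same $J$, which is why a single cost appears in the sketch.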

IV. PROBLEM FORMULATION
In this section, the mathematical formulation of the proposed methodology is presented. In Section A the network selection problem is formalized as an AHP model whereas, in Section B, the user-aware bandwidth allocation problem is formalized as a cooperative LQ difference game.

A. AHP-based access network ranking
As described in Section III, using the AHP requires defining the target objective, the evaluation criteria (attributes) and the available options. The target objective, in the considered scenario, is the selection of the most suitable network considering user preferences. The criteria against which access networks can be evaluated can be defined in terms of (sub)sets of intrinsic network characteristics (e.g., cost, security, packet delay, packet loss, bandwidth, coverage). In heterogeneous networks the variance between the values of these characteristics can be significant. For example, satellite access networks are typically characterized by high security and privacy levels [36], delays, costs, coverage and bandwidth, while terrestrial access networks offer low costs and delays with limited security, coverage and bandwidth. The options, in a heterogeneous network, are represented by all the available access networks associated with given users. Such networks are characterized in terms of the abovementioned intrinsic characteristics.
Based on the defined hierarchical structure, it is possible to apply the steps defined in Section III. In particular, the two comparison matrices can be computed and stored by the network operators or the service providers, to be used during the connections. The outcome of this first step is thus a matrix $R$ whose generic entry $r_{ij}$ represents the score assigned to the choice of associating user/cluster $i$ with network $j$. A further refinement of such static assignment considers users' past interactions: implicit and explicit user feedback can be used to update the comparison matrices. This leads to the definition of a Quality of Experience (QoE) Management framework.
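How one row of the ranking matrix could be obtained for a single cluster is sketched below. The sketch assumes the standard AHP synthesis step (a weighted sum of the per-criterion option scores, weighted by the attribute scores); the two criteria, the network order and all pairwise judgements are hypothetical values chosen only for illustration.

```python
def row_means_of_normalized_columns(matrix):
    """Normalize each column to sum to 1, then average each row."""
    n = len(matrix)
    col_sums = [sum(row[j] for row in matrix) for j in range(n)]
    return [sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
            for i in range(n)]


def rank_networks(criteria_matrix, option_matrices):
    """Global score of each network for one user/cluster.

    criteria_matrix: pairwise comparison of the criteria (the preferences).
    option_matrices: one pairwise comparison of the networks per criterion,
                     listed in the same order as the criteria.
    """
    weights = row_means_of_normalized_columns(criteria_matrix)
    per_criterion = [row_means_of_normalized_columns(m) for m in option_matrices]
    n_options = len(option_matrices[0])
    return [sum(weights[c] * per_criterion[c][o] for c in range(len(weights)))
            for o in range(n_options)]


networks = ["Satellite", "5G-Terrestrial", "WLAN"]

# Hypothetical cluster that values security five times more than cost.
criteria = [[1.0, 1 / 5],   # cost vs (cost, security)
            [5.0, 1.0]]     # security vs (cost, security)

# Hypothetical judgements: the satellite is the most expensive option...
cost = [[1.0, 1 / 3, 1 / 5],
        [3.0, 1.0, 1 / 2],
        [5.0, 2.0, 1.0]]
# ...and the most secure one.
security = [[1.0, 3.0, 5.0],
            [1 / 3, 1.0, 2.0],
            [1 / 5, 1 / 2, 1.0]]

scores = rank_networks(criteria, [cost, security])
ranking = sorted(zip(networks, scores), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Repeating the computation for every cluster, each with its own criteria comparison, yields one row per cluster; stacking the rows gives a cluster-by-network score matrix of the kind described above.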

B. Bandwidth allocation as a Cooperative Difference Game
The second step of the proposed methodology consists in deciding how to allocate the available resources within the different access networks. Based on the static association between access networks and users, a cooperative game is set up to dynamically solve the resource allocation problem. Such allocation is performed considering users' (or services) requirements and the networks' status.
It is assumed that networks do not have full queues (but only limited resources in terms of bandwidth), since the backhaul capacity of each access network is assumed to be greater than its allocated bandwidth.
The cooperative difference game can be formulated as a particular case of (5)-(11), with two main differences. First, there is no constraint on the final state value, since the aim is to minimize that value rather than to reach a predefined one. Second, the control variables contribute only linearly to the cost function, because the aim is to maximize the usage of the networks with the higher grade: the result of the AHP is used as the weight matrices R, taken with a minus sign so that usage is maximized within the minimization formulation. The assigned resources are not minimized, since under-using the networks would imply queue growth.
In the resulting LQ formulation, $T$ is the final time of the optimization window, $x(k)$ is the state of the queues (with initial values $x_0$) and the vectors $u$, $v$ and $w$ are the variables which can be manipulated by the three players (i.e., the networks). In particular, $u$ is the bandwidth assigned by one of the networks, and $u_i$ represents the bandwidth of that network assigned to cluster $i$; $v$ and $w$ are defined in the same way. The traffic arriving at the clusters' queues is modelled as a disturbance $d$, and its evolution depends on the service associated with the cluster itself. The values $\bar{x}$, $\bar{u}$, $\bar{v}$, $\bar{w}$ are the upper bounds of the respective variables. The matrices $R_{(\cdot)}$ are extracted from the matrix $R = [R_u, R_v, R_w]$, output of the first step of the proposed methodology (see previous section). The weight matrix $Q$ makes it possible to give a specific user higher relevance in the cost function, thereby capturing different contractual positions.
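The role of the two cost terms can be made concrete with a small Python sketch that only evaluates the shared cost of a candidate allocation; it does not solve the game. Several elements are assumptions introduced for illustration: the queue dynamics (queue plus arrivals minus served traffic, clipped at zero, with a unit time step), a diagonal queue weight, the terminal queue penalty, and all numerical values. Bandwidth limits are not enforced here.

```python
def shared_cost(x0, demand, alloc, scores, q):
    """Evaluate the common cost of a candidate allocation over a horizon.

    x0:     initial queue of each cluster
    demand: demand[k][i], traffic arriving at cluster i's queue at step k
    alloc:  alloc[k][n][i], bandwidth of network n given to cluster i at step k
    scores: scores[n][i], AHP score of serving cluster i with network n
    q:      q[i], weight of cluster i's queue in the cost
    Returns (cost, final_queues).
    """
    n_clusters = len(x0)
    n_networks = len(scores)
    x = list(x0)
    cost = 0.0
    for k in range(len(demand)):
        # Quadratic penalty on the queue lengths.
        cost += sum(q[i] * x[i] ** 2 for i in range(n_clusters))
        # Linear term with a minus sign: using well-ranked networks lowers the cost.
        cost -= sum(scores[n][i] * alloc[k][n][i]
                    for n in range(n_networks) for i in range(n_clusters))
        # Assumed queue update: arrivals in, served traffic out, never negative.
        served = [sum(alloc[k][n][i] for n in range(n_networks))
                  for i in range(n_clusters)]
        x = [max(0.0, x[i] + demand[k][i] - served[i]) for i in range(n_clusters)]
    # Penalty on what is left in the queues at the end of the window.
    cost += sum(q[i] * x[i] ** 2 for i in range(n_clusters))
    return cost, x


# Hypothetical setting: two networks, two clusters, two time steps.
scores = [[0.7, 0.2],   # network 0 is preferred by cluster 0
          [0.3, 0.8]]   # network 1 is preferred by cluster 1
x0 = [2.0, 2.0]
demand = [[3.0, 3.0], [3.0, 3.0]]
q = [1.0, 1.0]

preferred = [[[4.0, 0.0], [0.0, 4.0]]] * 2   # each cluster on its favourite network
swapped = [[[0.0, 4.0], [4.0, 0.0]]] * 2     # each cluster on the other network

cost_pref, queues_pref = shared_cost(x0, demand, preferred, scores, q)
cost_swap, queues_swap = shared_cost(x0, demand, swapped, scores, q)
print("preferred allocation is cheaper:", cost_pref < cost_swap)
```

Both candidate allocations empty the queues, so the quadratic terms coincide and the difference in cost comes entirely from the linear, preference-weighted term.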
The proposed two-step methodology is depicted in Fig. 2. The matrix R is periodically updated by the AHP processing block (Fig. 2 [...]

The time step is assumed to be 1 unit of time, while the clusters' queues are assumed to start with 2 Mb stored. The bandwidth limits for the considered networks are: 20 Mbps for Satellite, 10 Mbps for 5G-Terrestrial and 6 Mbps for WLAN. The resulting AHP option score (i.e., network ranking) according to the attribute priority and the attribute score is presented in Fig. 4. The clusters are characterized by different services, and each service requires a different amount of data traffic; the traffic required by the three clusters is depicted in Fig. 5.

The network ranking in Fig. 4, capturing the clusters' preferences, and the clusters' demand in terms of bandwidth, depicted in Fig. 5, drive the cooperative LQ differential game described in Section IV. In particular, the networks cooperate in order to match the clusters' demand while considering their preferences. This cooperation is possible by virtue of the chosen multi-connectivity integration and management strategies (i.e. RAN-based integration and RAN-Controlled management, see Section II).

Simulations show that all the bandwidth of the satellite network is allocated to C1 and, when its demand exceeds the satellite bandwidth limit, traffic is steered to the 5G-Terrestrial network. This result is in line with the AHP ranking: C1 prefers the satellite and, as second choice, the WLAN which, however, is congested.

The bandwidth assigned to cluster 2 is shown in Fig. 6. Simulations show that the 5G-Terrestrial network is the most used; when the latter is congested, the traffic is steered to the satellite network since the WLAN is saturated by C3. This result is in line with the AHP ranking: C2 prefers the WLAN which, however, is completely used by C3 (for which it represents the first choice), and thus traffic is steered to the 5G-Terrestrial network, which represents C2's second choice.

The bandwidth assigned to cluster 3 is shown in Fig. 7. Due to the strong preference of C3 for the WLAN, its traffic is mostly sent on that network; when the WLAN has no bandwidth to assign, the 5G-Terrestrial network, which represents C3's second choice, is used. It is possible to note that, when the Satellite has free bandwidth and the WLAN is congested, the proposed algorithm reallocates the 5G-Terrestrial bandwidth previously allocated to C2 in favour of C3; C2's demand is then satisfied with the satellite bandwidth which, instead, represents the least favourite option for C3.

The aggregate allocated bandwidth of each network is depicted in Fig. 8. As shown, the networks' bandwidth limits are respected, and only the 5G-Terrestrial network is not fully allocated. This is because the 5G-Terrestrial is the only network that is not the first choice of any cluster. The observed behaviour shows that clusters with a strong preference for a specific network are favoured over undecided clusters.

VI. CONCLUSION
In this paper a user-aware allocation problem in heterogeneous networks has been addressed. The reference scenario is of great interest as it makes it possible to fully exploit the potential of 5G and to achieve a seamless integration of different access networks. The proposed approach is structured in two phases: first, for each user (or cluster of homogeneous users), the available access networks are ranked according to the user's preferences and the network characteristics; then, resources are allocated within each network, in a dynamic fashion, by setting up a cooperative LQ difference game in which the access networks cooperate to maximize the heterogeneous network performance while matching user preferences (and requirements). The framework described can be easily extended to take into consideration users' feedback to adapt the static network assignment. The authors are currently working on more complex scenarios involving a larger number of clusters.