Unsupervised Learning for D2D-Assisted Multicast Scheduling in mmWave Networks

The combination of multicast and directional mmWave communication paves the way for solving spectrum crunch problems, increasing spectrum efficiency, ensuring reliability, and reducing access point load. Furthermore, multi-hop relaying is considered as one of the key interest areas in future 5G+ systems to achieve enhanced system performance. Based on this approach, users located close to the base station may serve as relays towards cell-edge users in their proximity by using more robust device-to-device (D2D) links, which is essential, e.g., to reduce the power consumption for wearable devices. In this paper, we account for the limitations and capabilities of directional mmWave multicast systems by proposing a low-complexity heuristic solution that leverages an unsupervised machine learning algorithm for multicast group formation and by exploiting the D2D technology to deal with the blockage problem.


I. INTRODUCTION
Millimeter wave (mmWave) band transmissions allow wireless technologies to meet the high data rate requirements of bandwidth-hungry applications, such as multimedia services and extended reality (XR), which refers to all mixed real-and-virtual environments and human-machine interactions generated by computer technology and wearables. This is one of the main advantages of 5G new radio (NR) mmWave small cells, which are considered one of the main components of future 5G+ networks [1]. Using multicast via point-to-multipoint (PMP) communications in these small cells may further improve the spectrum efficiency. Multicasting, which is under consideration by the Third Generation Partnership Project (3GPP) for Release 17 [2] of 5G systems, can provide substantial improvements in terms of system efficiency, user experience, and total network throughput [3], which is critical for ultra-high-speed data transmissions.
Multi-hop relaying schemes are considered one of the key interest areas in future 5G+ systems. With device-to-device (D2D) communications enabled, users close to the base station (BS) can serve as relays towards cell-edge users in their proximity that are interested in the same multicast content, by using more robust D2D links. Hence, several non-adjacent links may be active at the same time, thus enabling concurrent transmissions to achieve better system performance. The authors in [4] demonstrated that mmWave and D2D symbiosis can improve throughput by up to 2.3 times. Furthermore, concurrent transmissions and D2D-enabled communications in directional multicast systems help to reduce energy consumption, as required by battery-constrained wearable devices [5], [6].
Multicasting has been widely investigated in traditional omnidirectional communications (i.e., at sub-6 GHz bands), but the design of efficient mmWave multicasting techniques has to account for the limited coverage of directional mmWave communications [7]. Since mmWave links are prone to blockage and suffer from high propagation loss, the performance of the multicast link can be severely affected. In case one user in a multicast group suffers from blockage, two options are possible: (i) all users experience this poor channel condition (human blockage reduces the signal-to-noise ratio by 15 dB), or (ii) the blocked user is served by the BS through unicast communication. Here we claim that, in a single-beam system, D2D links can improve performance compared to unicast communications in terms of transmission delay, energy consumption, and overall network throughput.
D2D-aided multicasting in mmWave directional systems has been investigated in several recent studies. In [8], an efficient heuristic is designed for multicast data delivery, where D2D multi-hop and concurrent transmissions are jointly exploited to achieve lower energy consumption compared to a series of unicast transmissions. More recently, in [9] and [5], an optimal multicast scheduling problem is formulated, with D2D links and concurrent transmissions, through a mixed-integer non-linear program (MINLP), which is known to be NP-hard. Heuristic solutions with cubic complexity are also designed. A similar approach is proposed in [10], where multicast scheduling jointly exploits relaying and spatial sharing properties of mmWave networks to minimize the overall data delivery time. In [6], an optimal D2D-enabled multicast scheduling policy is proposed by constructing an ILP problem with the goal to minimize energy consumption in mmWave cellular networks.
In practical scenarios, the problem of multicasting with directional beams calls for new strategies that are simpler to compute than the non-polynomial optimal ones used in the above-mentioned papers, while still guaranteeing near-optimal performance [11]. For this purpose, machine learning approaches can be used to obtain close-to-optimal solutions at low computational cost.
In contrast with the previous studies, this paper examines the achievable performance of a D2D-assisted multicast mmWave system by taking into account both the complexity and the directivity-imposed challenges. We use an unsupervised learning scheme to cluster multicast users and define the beam resolution to be swept. Moreover, we refer to D2D communications as a blockage mitigation technique, which is applied when users cannot be served through a multicast link.
Our Contributions. In this paper, we analyze a mmWave communication system wherein the NR BS conveys multicast data to a group of users under coverage by properly generating the transmission beams. For the multicast group formation, we employ the Self-Organizing Map (SOM), one of the traditional unsupervised learning algorithms, to obtain a near-optimal result quickly. The complexity of SOM is linear in the number of users and quadratic in the number of map units. Then, we address the possible blockage conditions of multicast users by proposing a D2D-aided multicast scheduling algorithm. The algorithm utilizes D2D transmissions in proximity, thereby ensuring service continuity even if the BS fails to serve devices because of blockage or of outage caused by distance.
This paper is organized as follows. Section II illustrates our system model. Section III describes the multicast subgroup formation as well as the basic idea of D2D-assisted multicasting. Numerical results and related discussion are presented in Sections IV and V. Conclusions are summarised in Section VI.

II. SYSTEM MODEL
This section describes the system model and its core components, including the deployment, antenna, propagation, and blockage models.

A. Deployment Model
We consider a 5G+ NR outdoor deployment, where all user equipment (UE) devices are provided with mmWave modules and served by an NR BS operating in the 28 GHz band. We focus the analysis on the coverage area of a single antenna array, where a group of $M$ UEs of height $h_U$ is uniformly distributed within a sector of 90°, as illustrated in Fig. 1. UEs are the communication devices carried by people interested in a video streaming service. The NR BS, located at the origin of the coordinates, has height $h_A$ and transmits data to multiple users through a multicast mmWave link. The NR BS has a coverage radius $R_d$ within which all UEs reliably receive data. However, due to blockage, the connection can be disrupted; we elaborate on this assumption in the next subsections.

B. Antenna Model
We consider planar antenna arrays at both the NR BS and UEs, and assume that the radiation pattern is represented as a conical area with angle $\theta$, coinciding with the half-power beamwidth (HPBW) of the antenna array. For a linear antenna array, $\theta$ is computed as provided in [12] from $\theta_{3dB}$, the angle at which the output power falls 3 dB below its maximum, and $\theta_m$, the location of the array maximum, calculated as $\theta_m = \arccos(-\beta/\pi)$, where $\beta$ is the array orientation. We assume that $\theta_m = \pi/2$ for $\beta = 0$. The mean gain over the HPBW is likewise obtained as in [12] from the upper and lower 3-dB points and the number of antenna elements $N$.
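As a rough numerical illustration of the antenna model, the sketch below evaluates the HPBW and the mean gain over the HPBW for a uniform linear array. It assumes half-wavelength element spacing, the textbook 3-dB points $\arccos((-\beta \pm 2.782/N)/\pi)$, and the unnormalized array factor as the gain; these are consistent with $\theta_m = \arccos(-\beta/\pi)$ but are not spelled out in this section, and all function names are hypothetical.

```python
import numpy as np

def three_db_points(n_elements, beta=0.0):
    """Lower and upper 3-dB angles (rad), ASSUMING half-wavelength spacing."""
    offset = 2.782 / n_elements
    lower = np.arccos((-beta + offset) / np.pi)
    upper = np.arccos((-beta - offset) / np.pi)
    return float(lower), float(upper)

def hpbw(n_elements, beta=0.0):
    """HPBW as twice the angle between the array maximum and a 3-dB point."""
    theta_m = np.arccos(-beta / np.pi)
    lower, _ = three_db_points(n_elements, beta)
    return float(2.0 * abs(theta_m - lower))

def mean_gain(n_elements, beta=0.0, samples=2000):
    """Average of the (assumed) array factor over the HPBW, midpoint rule."""
    lower, upper = three_db_points(n_elements, beta)
    step = (upper - lower) / samples
    theta = lower + (np.arange(samples) + 0.5) * step
    psi = np.pi * np.cos(theta) + beta
    num = np.sin(n_elements * psi / 2.0)
    den = np.sin(psi / 2.0)
    tiny = np.abs(den) < 1e-12            # limit of the ratio at psi = 0 is N
    ratio = np.where(tiny, float(n_elements), num / np.where(tiny, 1.0, den))
    return float(np.mean(np.abs(ratio)))

for n in (4, 8, 16, 32):
    print(n, np.degrees(hpbw(n)), mean_gain(n))
```

Under these assumptions the beam narrows and the mean gain grows as antenna elements are added, which is the trade-off the multicast scheduler has to balance.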

C. Propagation Model
The 3GPP urban micro (UMi) street canyon model [13] is used to model the mmWave propagation. Accordingly, for the line-of-sight (LoS) deployment, the path loss measured in dB is given by:
$$L_{dB}(y) = 32.4 + 21\log_{10}(y) + 20\log_{10}(f), \tag{4}$$
where $f$ is the operating frequency in GHz, and $y$ is the three-dimensional (3D) distance between the NR BS and the UE.

D. Blockage Model
In real-life outdoor deployments, 5G NR systems suffer from the presence of mobile obstacles, such as humans and cars, which are often termed as "blockers" [14]. We assume that pedestrians might temporarily occlude the LoS path between the UE and the NR BS; that is, human blockage. In this case, the blockage attenuation B is assumed to be 15 dB.
For the blocked and non-blocked states, the shadow fading margins are represented by $M_{S,B}$ and $M_{S,nB}$. Then, the path loss in (4) can be rewritten in the linear scale using $A y^{\varsigma}$, where $A$ and $\varsigma$ are propagation coefficients:
$$A_{LoS,nB} = 10^{2\log_{10} f + 3.24}\, M_{S,nB}, \quad \varsigma_{LoS} = 2.1,$$
$$A_{LoS,B} = 10^{2\log_{10} f + 4.74}\, M_{S,B}, \quad \varsigma_{LoS} = 2.1.$$
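The relation between the dB-scale path loss and the linear coefficients can be checked with a short sketch. It assumes the UMi street canyon LoS form $32.4 + 21\log_{10}(y) + 20\log_{10}(f)$, which is what the coefficients above imply, and treats the shadow fading margin as a value in dB that multiplies $A$ in the linear scale; the function names and the 100 m example distance are illustrative.

```python
import math

PATH_LOSS_EXPONENT = 2.1  # varsigma for the LoS state

def path_loss_db(distance_m, freq_ghz, blocked=False, blockage_db=15.0):
    """LoS path loss in dB, with the 15 dB human-blockage attenuation if blocked."""
    loss = 32.4 + 21.0 * math.log10(distance_m) + 20.0 * math.log10(freq_ghz)
    return loss + (blockage_db if blocked else 0.0)

def linear_coefficient(freq_ghz, blocked=False, margin_db=0.0):
    """Coefficient A of the linear-scale loss A * y**varsigma.

    margin_db is the shadow fading margin; expressing it in dB is an
    assumption of this sketch.
    """
    exponent = 2.0 * math.log10(freq_ghz) + (4.74 if blocked else 3.24)
    return 10.0 ** exponent * 10.0 ** (margin_db / 10.0)

y, f = 100.0, 28.0
linear_loss = linear_coefficient(f) * y ** PATH_LOSS_EXPONENT
print(path_loss_db(y, f), 10.0 * math.log10(linear_loss))
```

Both printed values coincide when the margin is 0 dB, and the blocked coefficient differs from the non-blocked one by exactly the 15 dB blockage attenuation (4.74 - 3.24 = 1.5 in the exponent).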
The blockers are modeled as cylinders with height $h_B$ and radius $r_B$ [15]. The number of blockers follows a Poisson distribution with density $\lambda_B$ per square meter.
Under this propagation model, the signal-to-noise ratio (SNR) $S(y)$ depends on $p_B(y)$, the blockage probability at the 3D distance $y$ [15], on $N_0$, the noise power spectral density, and on $W$, the operating bandwidth.
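One plausible way to combine these quantities is to average the inverse path loss over the blocked and non-blocked states, weighted by $p_B(y)$. The sketch below does exactly that; the averaging form, the transmit power and antenna gain arguments, and every numeric value in the usage lines are assumptions made for illustration, not parameters taken from this paper.

```python
import math

def snr_linear(distance_m, p_blocked, tx_power_w, gain_tx, gain_rx,
               a_nb, a_b, noise_psd_w_hz, bandwidth_hz, exponent=2.1):
    """Blockage-averaged SNR in linear scale (assumed form).

    a_nb and a_b are the linear propagation coefficients A for the
    non-blocked and blocked states; p_blocked is p_B(y), supplied by
    the caller.
    """
    mean_inverse_loss = ((1.0 - p_blocked) / a_nb + p_blocked / a_b) \
        * distance_m ** (-exponent)
    signal = tx_power_w * gain_tx * gain_rx * mean_inverse_loss
    return signal / (noise_psd_w_hz * bandwidth_hz)

# Illustrative values only: 28 GHz, 33 dBm (about 2 W), -174 dBm/Hz noise.
a_nb = 10.0 ** (2.0 * math.log10(28.0) + 3.24)
a_b = 10.0 ** (2.0 * math.log10(28.0) + 4.74)
snr = snr_linear(50.0, 0.3, 2.0, 10.0, 1.0, a_nb, a_b, 10.0 ** (-20.4), 1e8)
print(10.0 * math.log10(snr), "dB")
```

With equal shadow fading margins, moving $p_B(y)$ from 0 to 1 lowers the result by exactly 15 dB, matching the blockage attenuation $B$.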

III. PROPOSED SOLUTION
In this section, we introduce a low-complexity algorithm for the multicast subgroup formation (section III-A), and specify the D2D-based enhancements to account for the blocked users in the directional multicast transmission (section III-B).

A. Multicast Group Formation
We utilize an unsupervised learning algorithm for multicast subgroup formation. Specifically, clustering is applied to split a multicast group into subgroups, each served by the same beam. Subgroups are created by using SOM clustering, one of the most well-known unsupervised neural network models. In detail, since we deal with directional transmissions, users are grouped according to the similarity of their azimuth angle w.r.t. the reference axis.
The main idea is first to randomly generate weights as the characteristics of the neurons in the architecture, and then to map each row (observation) of the given data into an imaginary space where each row acts as a point. The number of neurons in the map (map size) is an application- or server-specific parameter. Namely, large maps create many small clusters, whereas a small map produces fewer but more composite clusters. Once each observation is represented by a point in the input space, a search for the closest points (users) is performed [16].
The pseudo-code for the classic SOM clustering tailored to the scenario of multicast subgrouping is presented in Algorithm 1. We recall that, at this stage, the SOM algorithm operates by considering multicast users, whereas blockage is accounted for in the next stage (see section III-B).
The following notation is used throughout the paper:
• $t$, the current iteration (epoch);
• $\lambda$, the time constant that is used to decay the radius and learning rate;
• $E$, the iteration limit, i.e., the total number of iterations the network can undergo;
• $i$, the row coordinate of the neuron grid;
• $j$, the column coordinate of the neuron grid;
• $d$, the distance between a neuron and the best matching unit (BMU);
• $w$, the weight vector;
• $w_{ij}(t)$, the weight of the connection between the neuron $i, j$ in the grid and the input vector's instance at iteration $t$;
• $x$, the input vector;
• $\alpha(t)$, the learning rate, decreasing with time in the interval [0, 1] to ensure the network convergence;
• $\beta_{ij}(t)$, the neighborhood function, monotonically decreasing and representing the distance of neuron $i, j$ from the BMU and the influence it has on the learning at step $t$;
• $\sigma(t)$, the radius of the neighborhood function, which determines how far neighbor neurons are examined in the 2D grid when updating vectors. It is gradually reduced over time.

Algorithm 1: SOM clustering for multicast subgroup formation
  Initialize each neuron's weight $w_{ij}$ to a random value;
  $t \leftarrow 1$;
  while $t < E$ do
    Select a random input vector $x(t)$;
    for all $N$ neurons in the map do
      Compute the Euclidean distance between the input vector $x(t)$ and the neuron's weight vector $w_{ij}(t)$;
      Track the neuron that produces the smallest distance (the BMU);
    end
    Determine the topological neighborhood $\beta_{ij}(t)$ of the BMU in the map and its radius $\sigma(t)$;
    Update the weight vectors of the neurons in the neighborhood of the BMU (including the BMU itself) by pulling them closer to the input vector;
    $t \leftarrow t + 1$;
  end

In each training epoch $t$, Algorithm 1 iterates through the elements of the input vector $x$. For each element, it finds the closest neuron and then updates the weights $w_{ij}$ of that neuron and of all its neighbors. Upon completion of Algorithm 1, neurons converge to final weight values through a competitive learning scheme that adjusts them to resemble nearby winning neurons. This process generates groups of similar neurons in the final map (i.e., multicast subgroups).
Thus, the output of this approach is the number $g$ and size of multicast subgroups $\mathcal{M} = \{M_1, M_2, \dots, M_j, \dots, M_g\}$, where $M_j$ contains the set of multicast users.
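The training loop of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used for the results: it assumes a Gaussian neighborhood on the neuron grid and exponential decay of both $\sigma(t)$ and $\alpha(t)$ with time constant $\lambda = E$, which are common SOM choices that this section does not fix, and the function name, default values, and random user angles are hypothetical.

```python
import numpy as np

def som_cluster(azimuth_rad, rows=2, cols=2, epochs=500, alpha0=0.5, seed=0):
    """Cluster users by azimuth angle with a small SOM (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(azimuth_rad, dtype=float).reshape(-1, 1)
    # Grid coordinates (i, j) of the N = rows * cols neurons.
    grid = np.array([(i, j) for i in range(rows) for j in range(cols)], dtype=float)
    weights = rng.uniform(x.min(), x.max(), size=(rows * cols, 1))
    sigma0 = max(rows, cols) / 2.0
    lam = float(epochs)                      # assumed time constant
    for t in range(1, epochs):               # t <- 1; while t < E
        sample = x[rng.integers(len(x))]     # random input vector x(t)
        bmu = int(np.argmin(np.linalg.norm(weights - sample, axis=1)))
        sigma = sigma0 * np.exp(-t / lam)    # shrinking neighborhood radius
        alpha = alpha0 * np.exp(-t / lam)    # decaying learning rate
        grid_d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        beta = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Pull the BMU and its grid neighbors towards the input vector.
        weights += alpha * beta[:, None] * (sample - weights)
    # Each user joins the subgroup of its closest neuron.
    labels = np.argmin(np.abs(x - weights.T), axis=1)
    return labels, weights.ravel()

rng = np.random.default_rng(1)
azimuths = rng.uniform(0.0, np.pi / 2.0, size=30)   # 30 users in a 90-degree sector
labels, weights = som_cluster(azimuths)
print(np.bincount(labels, minlength=4))
```

Neurons that attract no user simply yield empty subgroups, so the number of non-empty subgroups $g$ can be smaller than the map size.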
We note that the performance of clustering algorithms depends on the topology of the input data and the clustering goal. The advantages of SOM over other clustering algorithms are that: (i) it does not need a target output; (ii) it does not require an a priori estimate of the number of clusters, as is the case for K-means; and (iii) it can deal with ambiguity, i.e., the assignment of points to multiple clusters (unlike, e.g., hierarchical clustering).

B. D2D-aided Multicast Policy
We introduce the D2D-aided multicast scheduling algorithm (MSA), designed to resolve possible blockage conditions of multicast group members by means of D2D transmissions from users in physical proximity. The proposed approach, described in Algorithm 2, is carried out in two steps that select:
• the transmission parameters of each multicast beam;
• the proper D2D transmitter for each user experiencing outage from the multicast transmission.
In particular, the output of Algorithm 1 serves as the input for Algorithm 2. First, the heuristic calculates the HPBW (beam resolution) $\theta$ required to serve subgroup $M_j$ (line 6), which is determined by the angle between multicast users $m$ and $m'$, the two edge users of the group, i.e., the two users with the largest angular separation.
Then, the SNR for multicast subgroup $M_j$ is computed and compared with $S_{thr}$, which represents the lower bound of the SNR for the most robust data transmission (i.e., MCS 1). If the SNR at user $m$ is less than the threshold (i.e., $S(y_m) < S_{thr}$), D2D communication is activated (see lines 7-13). The second step consists in discovering the multicast users that are closest to the users in the set $D$ to be served via D2D (lines 14-16), in order to establish direct communication links. We assume that D2D communications are performed concurrently, and that transmission starts as soon as the D2D transmitter receives the service from the NR BS.
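The two steps described above can be illustrated with the following sketch. It is not Algorithm 2 itself: it assumes that the per-user SNR values have already been computed, takes the beamwidth of a subgroup as the azimuth span between its two edge users, and picks the nearest served user as the D2D transmitter using planar Euclidean distance. The function name and the toy coordinates are hypothetical.

```python
import numpy as np

def d2d_aided_schedule(positions, labels, snr_db, snr_threshold_db):
    """Return per-subgroup beamwidths (rad) and a D2D relay for each outage user."""
    positions = np.asarray(positions, dtype=float)
    labels = np.asarray(labels)
    snr_db = np.asarray(snr_db, dtype=float)
    azimuth = np.arctan2(positions[:, 1], positions[:, 0])

    # Step 1: beamwidth per subgroup = angle between its two edge users.
    beams = {}
    for group in np.unique(labels):
        members = np.flatnonzero(labels == group)
        beams[int(group)] = float(azimuth[members].max() - azimuth[members].min())

    # Users below the SNR threshold form the D2D set D.
    served = snr_db >= snr_threshold_db
    served_idx = np.flatnonzero(served)

    # Step 2: the closest served multicast user relays the content.
    relays = {}
    for user in np.flatnonzero(~served):
        if served_idx.size == 0:
            break                      # nobody can act as a transmitter
        dist = np.linalg.norm(positions[served_idx] - positions[user], axis=1)
        relays[int(user)] = int(served_idx[np.argmin(dist)])
    return beams, relays

positions = [(10, 0), (10, 1), (0, 10), (1, 10), (30, 40)]
beams, relays = d2d_aided_schedule(positions, [0, 0, 1, 1, 1],
                                   [10.0, 8.0, 9.0, 7.0, -5.0], 0.0)
print(beams, relays)
```

In this toy layout the fifth user falls below the threshold and is relayed by the fourth user, its nearest served neighbor.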

C. Proposed Solution Complexity
The computational complexity of the proposed solution is given by:
$$\mathcal{O}(N^2) + \mathcal{O}(g\,|M_j|) + \mathcal{O}(|D|),$$
where $\mathcal{O}(N^2)$ is the complexity of the traditional SOM (Algorithm 1), which is used for multicast group formation. The second and third summands, $\mathcal{O}(g\,|M_j|) + \mathcal{O}(|D|)$, represent the complexity of Algorithm 2, executed after SOM. They stem from (i) two nested for loops, one over all $g$ multicast groups and one over the $|M_j|$ users of each group, and (ii) a for loop over all identified D2D users.

IV. PERFORMANCE EVALUATION
In this section, we evaluate the effects of the proposed D2D-assisted multicast scheduling via learning and assess the performance of our mmWave system. By default, we assume a linear antenna array with 32×4, 16×4, 8×4, 4×4, 2×4, and 1×4 antenna elements. The transmit power is fixed at 33 dBm. The other main simulation parameters, including the inputs for the SOM algorithm implemented in MATLAB using the Neural Network Clustering App with map dimension 2×2 (N = 4), are listed in Table I.
The analyzed performance parameters are:
• Network throughput (also known as aggregate throughput) is the sum of data rates that are delivered to all terminals in the network.
• Energy consumption is the amount of energy used during a given period of time.
• Energy efficiency (or efficient energy use) is defined as the achieved network throughput divided by the consumed energy, in bit/s/J.
We consider the following benchmark solutions to assess the performance of the proposed D2D-assisted multicast scheme:
• Directional multicast by using SOM, where the blocked users are served sequentially by narrow unicast links (Multicast).
• The BS transmits multicast data through a series of unicast transmissions (Unicast).
We initially discuss the impact of a variable service area radius and blocker density on the considered performance metrics, in Figures 2, 3, and 4. Solid curves refer to a density of 0.1 blockers per square meter, dashed curves to 0.3. As a general trend, Fig. 2 shows that an expansion of the service area radius corresponds to an increase in the completion time, which in turn increases the energy consumption for all analyzed schemes. One may also notice that the proposed framework outperforms both benchmark solutions. Specifically, an increase in the blocker density deteriorates the performance of both benchmark schemes, whereas it reduces the energy consumption of the proposed approach. This behavior can be explained by the fact that, in our system, blockage triggers the establishment of D2D communications. Therefore, a higher number of blockers leads to a higher number of activated adjacent D2D links and, hence, to a shorter completion time and lower energy consumption.
Fig. 2. Energy consumption as a function of the service area radius.
Curves with similar behavior are reported in Fig. 3 and Fig. 4, where we investigate energy efficiency and network throughput. Both metrics, as expected, decline as the service area radius grows.
Further, one may observe that in a scenario with higher blocker density, our solution achieves higher throughput and energy efficiency. However, this difference becomes smaller for larger areas of interest as we deal with highly directional transmissions.
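The three metrics defined above reduce to simple arithmetic, as the sketch below shows. It assumes that every transmission uses the same transmit power and that energy is the product of that power and the total airtime; this bookkeeping, the function names, and the example numbers are illustrative simplifications rather than the simulation setup of this section.

```python
def dbm_to_watt(power_dbm):
    """Convert a power level from dBm to watts."""
    return 10.0 ** ((power_dbm - 30.0) / 10.0)

def network_metrics(user_rates_bps, airtime_s, tx_power_dbm=33.0):
    """Return (network throughput, energy consumption, energy efficiency).

    user_rates_bps: data rate delivered to each terminal, in bit/s.
    airtime_s: duration of each transmission, in seconds (assumed sequential).
    """
    throughput = sum(user_rates_bps)                       # bit/s
    energy = dbm_to_watt(tx_power_dbm) * sum(airtime_s)    # joules
    return throughput, energy, throughput / energy         # bit/s/J

throughput, energy, efficiency = network_metrics([1e9, 5e8], [1.0, 2.0], 30.0)
print(throughput, energy, efficiency)
```

With a 30 dBm (1 W) transmitter and 3 s of total airtime, the example gives 1.5 Gbit/s of network throughput, 3 J of consumed energy, and an energy efficiency of 0.5 Gbit/s/J.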
Finally, we investigate the impact of the neuron map size $N$ on the system performance in Fig. 5. Solid curves refer to the case with $N = 4$, and dashed curves to $N = 6$. We recall that this parameter affects only multicast schemes, leaving the unicast transmission mode unchanged. To this end, energy consumption for the considered schemes is illustrated for maps of 2×2 and 1×6 neurons and a density of 0.3 blockers per square meter. We note that, for the considered service area (sector of 90°) and number of multicast users $M = 30$, the smaller map exhibits visibly better performance, which is mainly explained by the specifics of multicast service in directional systems. This fact can be inferred in Fig. 5 from the comparison between the unicast and the multicast mode when $N = 6$. Namely, for a small service radius, splitting multicast users into a high number of subgroups is not efficient, as serving the subgroups sequentially introduces a delay while still not providing the best transmit gain compared to unicast. On the contrary, at larger transmission distances, the multicast approach works better.

V. DISCUSSION
This section discusses the aspects that are out of this paper's scope but are essential in designing mmWave multicast scheduling strategies, thus they will be investigated in future research.
1) Mobility: In this paper, we assume a low-mobility scenario, such as sports stadiums, concert halls, or low-speed urban vehicles, where multicast users barely move during the transmission period of a 1 Gbit data packet. For networks with high-speed nodes, such as high-speed trains, uninterrupted connectivity can be ensured with features such as beam switching and tracking. Then, in the case of mobility, the beam should track the major part of the multicast subgroup. If a device moves in a different direction, it will at some time instant lose the connection with the BS, but a D2D link will be used instead (as described in Algorithm 2, lines 7-13). This way, the proposed approach can be adapted to the dynamic scenario.
2) Traffic: In real deployments, both multicast and unicast sessions may coexist. In this case, multicast may get priority over unicast traffic due to its service properties. Therefore, efficient radio resource management frameworks are required to handle joint unicast and multicast traffic in mmWave networks.

VI. CONCLUSIONS
In this work, we proposed a low-complexity heuristic solution for D2D-assisted multicasting. The multicast group formation is performed by employing an unsupervised learning approach (i.e., SOM), whereas D2D links are used to cope with blockage and/or outage due to distance. Our heuristic considers essential system parameters, which include directional transmission, blockage, minimum required data rate for successful reception, and beam resolution of the antenna array. From the simulation results, we may conclude that SOM works well for directional multicast and that the number of neurons in the learning phase is service-specific and also depends on the considered area of interest. We also demonstrated that the proposed scheme copes well with blockage: in the considered scenarios, its performance even improves as the blocker density increases.