Learning-Based Traffic Scheduling in Non-Stationary Multipath 5G Non-Terrestrial Networks
In non-terrestrial networks, where low Earth orbit satellites and user equipment move relative to each other, line-of-sight tracking and adapting to channel state variations due to endpoint movements are a major challenge. Therefore, continuous line-of-sight estimation and channel impairment compensation are crucial for user equipment to access a satellite and maintain connectivity. In this paper, we propose a framework based on actor-critic reinforcement learning for traffic scheduling in non-terrestrial networks scenario where the channel state is non-stationary due to the variability of the line of sight, which depends on the current satellite elevation. We deploy the framework as an agent in a multipath routing scheme where the user equipment can access more than one satellite simultaneously to improve link reliability and throughput. We investigate how the agent schedules traffic in multiple satellite links by adopting policies that are evaluated by an actor-critic reinforcement learning approach. The agent continuously trains its model based on variations in satellite elevation angles, handovers, and relative line-of-sight probabilities. We compare the agent’s retraining time with the satellite visibility intervals to investigate the effectiveness of the agent’s learning rate. We carry out performance analysis while considering the dense urban area of Paris, where high-rise buildings significantly affect the line of sight. The simulation results show how the learning agent selects the scheduling policy when it is connected to a pair of satellites. The results also show that the retraining time of the learning agent is up to 0.1𝑡𝑖𝑚𝑒𝑠 the satellite visibility time at given elevations, which guarantees efficient use of satellite visibility.