Traffic Scheduling in Non-Stationary Multipath Non-Terrestrial Networks: A Reinforcement Learning Approach
In Non-Terrestrial Networks (NTNs), where Low Earth Orbit (LEO) satellites and User Equipment (UE) move relative to each other, tracking the Line-of-Sight (LOS) and adapting to channel state variations caused by endpoint movement are major challenges. Continuous LOS estimation and channel impairment compensation are therefore crucial for a UE to access a satellite and maintain connectivity. In this paper, we propose an Actor-Critic (AC) Reinforcement Learning (RL) framework for traffic scheduling in NTN scenarios where the channel state is non-stationary because the LOS varies with the current satellite elevation. We deploy the framework as an agent in a Multi-Path Routing (MPR) scheme in which the UE can access more than one satellite simultaneously to improve link reliability and throughput, and we study how the agent schedules traffic over the multiple satellite links. The agent retrains continuously in response to variations in satellite elevation angles, handoffs, and the associated LOS probabilities. We compare the agent’s retraining time with the satellite visibility intervals to investigate the effectiveness of the agent’s learning rate. We carry out the performance analysis for the dense urban area of Chicago, where high-rise buildings significantly affect the LOS. The simulation results show how the learning agent selects the scheduling policy when it is connected to a pair of satellites. The results also show that the retraining time of the learning agent is up to 0.1 times the satellite visibility time at certain elevations, which guarantees efficient use of satellite visibility.
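To make the scheduling idea concrete, the following is a minimal sketch of an actor-critic agent that chooses between two satellite links whose LOS probabilities change partway through the run. It is an illustrative simplification, not the framework evaluated in this paper: the agent is stateless, the reward is 1 when the chosen link is in LOS and 0 otherwise, and the class name `ActorCriticScheduler`, the helper `run_phase`, the learning rates, the step counts, and the LOS probabilities are all hypothetical values chosen for the example.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


class ActorCriticScheduler:
    """Stateless actor-critic (hypothetical simplification).

    The actor keeps one preference per link; the critic keeps a single
    running estimate of the expected reward, used as a baseline.
    """

    def __init__(self, n_links, actor_lr=0.1, critic_lr=0.1, rng=None):
        self.prefs = [0.0] * n_links  # actor parameters
        self.value = 0.0              # critic estimate
        self.actor_lr = actor_lr      # assumed value, not from the paper
        self.critic_lr = critic_lr    # assumed value, not from the paper
        self.rng = rng or random.Random()

    def policy(self):
        """Probability of scheduling the next packet on each link."""
        return softmax(self.prefs)

    def choose_link(self):
        links = range(len(self.prefs))
        return self.rng.choices(links, weights=self.policy())[0]

    def update(self, link, reward):
        probs = self.policy()
        td_error = reward - self.value           # advantage estimate
        self.value += self.critic_lr * td_error  # critic step
        for a in range(len(self.prefs)):
            # Gradient of log softmax policy w.r.t. preference a.
            grad = (1.0 if a == link else 0.0) - probs[a]
            self.prefs[a] += self.actor_lr * td_error * grad  # actor step


def run_phase(agent, p_los, steps, rng):
    """Train for `steps` packets with fixed per-link LOS probabilities."""
    for _ in range(steps):
        link = agent.choose_link()
        reward = 1.0 if rng.random() < p_los[link] else 0.0
        agent.update(link, reward)
    return agent.policy()


rng = random.Random(0)
agent = ActorCriticScheduler(n_links=2, rng=rng)
# Phase 1: link 0 has the better LOS probability (hypothetical values).
policy_before = run_phase(agent, [0.9, 0.3], 1000, rng)
# Phase 2: elevations change and link 1 becomes the better link.
policy_after = run_phase(agent, [0.2, 0.8], 5000, rng)
print("policy before change:", policy_before)
print("policy after change: ", policy_after)
```

The number of steps the agent needs in the second phase to shift its policy toward the newly favourable link plays the role of the retraining time discussed above; in this toy setting it depends strongly on the assumed learning rates and on how far the policy had saturated before the change.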