Vol. 43, No. 2

To address the problems of low transmission energy efficiency and low overload capacity in the NORA-QL method, an improved method (I-NORA-QL) suited to satellite communication networks is proposed. To reduce transmission power consumption, I-NORA-QL improves the Q-learning strategy using global information broadcast by the satellite, incorporates the transmit power of the user equipment into the reward function, and designs the learning rate as a decay function of the number of algorithm iterations. Furthermore, building on Access Class Barring (ACB), I-NORA-QL adaptively adjusts the ACB barring factor according to the Q-value characteristics and the load estimate obtained during learning, and thereby performs overload control. Simulation results show that, compared with existing methods, I-NORA-QL effectively reduces the average power consumption of user equipment and significantly improves throughput when the system is overloaded.

Key words: Satellite communications; Random access; Energy efficiency; Overload control; Non-Orthogonal Multiple Access; Q-learning

DOI:.20210913001

Citation: YANG Weikang, XU Xiaodong. A NOMA-based Q-learning random access scheme for satellite communications[J]. Journal of Telemetry, Tracking and Command, 2022, 43(2): 25–35.
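The ideas named in the abstract (a power-aware reward, a learning rate that decays with the iteration count, and a load-dependent ACB barring factor) can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so every function name, constant, and functional form below is a hypothetical stand-in chosen for illustration, not the authors' I-NORA-QL definitions. The Q-value update is the common stateless form Q(a) <- (1 - alpha) Q(a) + alpha r, which is also an assumption.

```python
# Illustrative sketch only: all names, constants, and formulas here are
# assumptions, not the definitions used in the I-NORA-QL paper.

def decayed_learning_rate(iteration, alpha0=0.5, decay=0.01):
    """Hypothetical schedule: the learning rate shrinks as iterations grow."""
    return alpha0 / (1.0 + decay * iteration)


def power_aware_reward(success, tx_power, max_power=1.0, weight=0.5):
    """Hypothetical reward: a successful access earns less when it used more power."""
    if not success:
        return -1.0
    return 1.0 - weight * (tx_power / max_power)


def update_q(q_values, action, reward, iteration):
    """Stateless Q-learning update using the decayed learning rate (assumed form)."""
    alpha = decayed_learning_rate(iteration)
    q_values[action] = (1.0 - alpha) * q_values[action] + alpha * reward
    return q_values[action]


def acb_factor(estimated_load, capacity):
    """Hypothetical barring factor: admit everyone until the estimated load
    exceeds capacity, then admit roughly capacity / load of the devices."""
    if estimated_load <= capacity:
        return 1.0
    return capacity / estimated_load


if __name__ == "__main__":
    q = [0.0, 0.0]  # one Q value per (hypothetical) access resource
    r = power_aware_reward(success=True, tx_power=0.5)
    update_q(q, action=1, reward=r, iteration=0)
    print(q, acb_factor(estimated_load=20, capacity=10))
```

In this sketch a device that succeeds at low power receives a larger reward than one that succeeds at high power, so the learned Q values favour low-power choices. The barring factor only throttles access once the estimated load exceeds the assumed capacity.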