In this paper, we formulate a joint uplink scheduling, phase shift control, and beamforming optimization problem in intelligent reflecting surface (IRS)-aided systems. We consider two objectives: maximizing the aggregate throughput and achieving proportional fairness. We propose a deep reinforcement learning-based user scheduling, phase shift control, and beamforming optimization (DUPB) algorithm to solve the joint problem. The proposed DUPB algorithm applies the neural combinatorial optimization (NCO) technique to solve the user scheduling subproblem, in which a stochastic user scheduling policy is learned by deep neural networks with an attention mechanism. Curriculum learning with deep deterministic policy gradient (CL-DDPG) is used in the proposed DUPB algorithm to jointly optimize the phase shift control and beamforming vectors. Knowledge of the hidden convexity of the joint problem is exploited to facilitate policy learning in CL-DDPG. Simulation results show that, with the maximum aggregate throughput as the objective, the proposed DUPB algorithm achieves a higher aggregate throughput than the alternating optimization (AO)-based algorithms. Moreover, the throughput fairness among the users is improved when proportional fairness is used as the objective. The proposed DUPB algorithm also outperforms the AO-based algorithms in terms of runtime when the number of reflecting elements is large.
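To illustrate how the two objectives named above differ, the following minimal sketch compares them on hypothetical per-user throughputs. It assumes the standard definitions (aggregate throughput as the sum of user rates, proportional fairness as the sum of the logarithms of user rates); the abstract does not state the exact formulation used in the paper, and the function names and rate values are illustrative only.

```python
import math

def aggregate_throughput(rates):
    """Sum of per-user throughputs (assumed definition)."""
    return sum(rates)

def proportional_fairness(rates):
    """Sum of log per-user throughputs (standard proportional-fair utility, assumed)."""
    return sum(math.log(r) for r in rates)

# Hypothetical per-user throughputs for two candidate allocations.
skewed = [9.0, 0.5, 0.5]    # one user gets most of the resources
balanced = [3.0, 3.0, 3.0]  # resources shared evenly

for name, rates in [("skewed", skewed), ("balanced", balanced)]:
    print(name, aggregate_throughput(rates), round(proportional_fairness(rates), 3))
```

Under these assumptions, the aggregate-throughput objective prefers the skewed allocation, while the proportional-fairness objective prefers the balanced one, which is consistent with the fairness improvement reported in the abstract.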
               