In this paper, a multi-agent deep reinforcement learning method was adopted to realize cooperative spectrum sensing in cognitive radio networks. Each secondary user learns an efficient sensing strategy from the… Click to show full abstract
In this paper, a multi-agent deep reinforcement learning method was adopted to realize cooperative spectrum sensing in cognitive radio networks. Each secondary user learns an efficient sensing strategy from the sensing results of some of the selected spectra to avoid interference to the primary users and to coordinate with other secondary users. It is necessary to balance exploration and exploitation in the learning process when using deep reinforcement learning methods, helping explain that upper confidence bound with Hoeffding-style bonus has been adopted in this paper to improve the efficiency of exploration. The simulation results verify that the proposed algorithm, when compared with the conventional reinforcement learning methods with $\varepsilon $ -greedy exploration, is much easier to achieve faster convergence speed and better reward performance.
               
Click one of the above tabs to view related content.