Deep-Reinforcement-Learning-Based Distributed Dynamic Spectrum Access in Multiuser Multichannel Cognitive Radio Internet of Things Networks

Abstract

Integrating cognitive radio into the Internet of Things (IoT) helps alleviate spectrum scarcity in large-scale IoT deployments, where a core technology is the design of spectrum access algorithms that assign spectrum holes effectively. However, because channels are only partially observable and the number of users in a cognitive radio Internet of Things (CRIoT) network has increased, secondary users have difficulty avoiding interference and accessing the spectrum quickly. This study presents a distributed dynamic spectrum access (DSA) algorithm that employs a priority experience replay deep echo state Q-network (PER-DESQN) for CRIoT networks with multiple users and channels. To accelerate Q-network convergence, we use an echo state network to estimate Q-values, exploiting the underlying temporal correlation. Then, to resolve Q-value overestimation and improve prediction accuracy, the Q-value estimation and action decision process are trained using a double deep Q-network (DDQN). Moreover, a priority experience replay mechanism that combines a Sum-Tree with importance sampling weights is proposed to optimize the DDQN and address the Q-value instability that results from random sampling. Simulation results demonstrate that the proposed algorithm makes fast and accurate DSA decisions and significantly boosts the network channel capacity.
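To make the replay mechanism named in the abstract more concrete, the following is a minimal sketch of a Sum-Tree with importance sampling weights, written in the style of standard prioritized experience replay. It is not the authors' implementation: the class and function names, the stratified sampling scheme, and the exponent beta are assumptions introduced for illustration, and the abstract does not give the paper's actual hyperparameters.

```python
import random


class SumTree:
    """Array-backed binary tree: leaves hold priorities, internal nodes hold sums.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        # Indices 0 .. capacity-2 are internal nodes, the rest are leaves.
        self.tree = [0.0] * (2 * capacity - 1)
        self.data = [None] * capacity
        self.write = 0  # next leaf slot to overwrite (ring buffer)
        self.size = 0   # number of stored transitions

    def total(self):
        """Sum of all priorities, stored at the root."""
        return self.tree[0]

    def update(self, leaf, priority):
        """Set the priority of one leaf and propagate the change to the root."""
        idx = leaf + self.capacity - 1
        change = priority - self.tree[idx]
        self.tree[idx] = priority
        while idx > 0:
            idx = (idx - 1) // 2
            self.tree[idx] += change

    def add(self, priority, transition):
        """Store a transition, overwriting the oldest one once the buffer is full."""
        self.data[self.write] = transition
        self.update(self.write, priority)
        self.write = (self.write + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def find(self, value):
        """Return the leaf whose cumulative priority interval contains value."""
        idx = 0
        while idx < self.capacity - 1:  # descend until a leaf is reached
            left = 2 * idx + 1
            if value < self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return idx - (self.capacity - 1)


def sample(tree, batch_size, beta=0.4):
    """Draw a stratified batch and its normalized importance sampling weights.

    beta is an assumed correction exponent, not a value from the paper.
    """
    segment = tree.total() / batch_size
    leaves, weights = [], []
    for k in range(batch_size):
        # One draw per equal-mass segment of the cumulative priority range.
        value = segment * (k + random.random())
        leaf = tree.find(value)
        prob = tree.tree[leaf + tree.capacity - 1] / tree.total()
        leaves.append(leaf)
        weights.append((tree.size * prob) ** (-beta))
    max_w = max(weights)
    return leaves, [w / max_w for w in weights]
```

In a training loop of this kind, each sampled transition's loss would typically be scaled by its weight, and its priority refreshed with `update` once a new temporal-difference error is available. Sampling and priority updates both cost O(log N) in the buffer size, which is the usual reason for choosing a Sum-Tree over a linear scan.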
