Distributed Reinforcement Learning based MAC protocols for autonomous cognitive secondary users

Abstract

We consider a decentralized cognitive radio network in which autonomous secondary users seek spectrum opportunities in licensed spectrum bands. We assume that the primary users' channel occupancy evolves as a Markov process, and formulate the spectrum sensing problem as a Decentralized Partially Observable Markov Decision Process (DEC-POMDP). We develop a distributed Reinforcement Learning (RL) algorithm that allows each autonomous cognitive radio to learn its own spectrum sensing policy. The resulting decentralized sensing policy enables secondary users to reach, without cooperation, an equilibrium that yields high utilization of idle channels while minimizing collisions among secondary cognitive radios. Moreover, we propose a decentralized channel access policy that controls the collision probability with primary users with high accuracy. Our numerical results show that this collision probability control remains robust as the sensing noise changes. They also show that the proposed learning algorithm improves spectrum utilization.
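To make the partially observable Markov setting concrete, the following is a minimal sketch of how a single secondary user might track its belief that one channel is idle. It is not the algorithm of this paper: it only illustrates the standard Bayes-then-predict belief update for a two-state Markov channel observed through a noisy sensor. All names and parameters (`p_stay_idle`, `p_busy_to_idle`, `p_false_alarm`, `p_miss`) are assumptions introduced for illustration.

```python
def update_on_observation(belief_idle, observed_idle, p_false_alarm, p_miss):
    """Bayes update of P(channel idle) after one noisy sensing outcome.

    Assumed sensor model (hypothetical parameters):
      P(sensed busy | idle) = p_false_alarm
      P(sensed idle | busy) = p_miss
    """
    if observed_idle:
        like_idle, like_busy = 1.0 - p_false_alarm, p_miss
    else:
        like_idle, like_busy = p_false_alarm, 1.0 - p_miss
    numerator = like_idle * belief_idle
    denominator = numerator + like_busy * (1.0 - belief_idle)
    # If the observation has zero probability under the model, keep the prior.
    return numerator / denominator if denominator > 0 else belief_idle


def predict_idle(belief_idle, p_stay_idle, p_busy_to_idle):
    """One-step Markov prediction of P(channel idle) in the next slot."""
    return belief_idle * p_stay_idle + (1.0 - belief_idle) * p_busy_to_idle


if __name__ == "__main__":
    belief = 0.5  # uninformed prior
    for sensed_idle in [True, True, False]:
        belief = update_on_observation(belief, sensed_idle, 0.1, 0.2)
        belief = predict_idle(belief, 0.9, 0.2)
        print(round(belief, 3))
```

In a sketch like this, an access rule could transmit only when the updated belief exceeds a threshold, which is one simple way to trade throughput against the collision probability with primary users; the paper's actual access policy may differ.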
