Q-learning based power control algorithm for D2D communication

Abstract

In this paper, we study reinforcement learning (RL) based power control for underlay D2D communication. Our approach treats D2D communication as a multi-agent system, and power control is achieved by maximizing system capacity while satisfying the quality of service (QoS) requirements of cellular users. We propose two RL based power control methods for D2D users: team-Q learning and distributed-Q learning. The former is a centralized method in which only one Q-value table needs to be maintained, while the latter lets D2D users learn independently and reduces the complexity of the Q-value table. Simulation results show how the two Q-learning algorithms differ in terms of convergence and reward function. In addition, we show that with distributed-Q learning, D2D users not only learn their transmit power in a self-organized way, but also achieve better system performance than the traditional power control method in Long Term Evolution (LTE).
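The distributed idea described above can be illustrated with a minimal sketch. It is not the algorithm from the paper, whose state space, reward and update rule are not given in this abstract. It assumes a standard tabular Q-learning update applied independently by each D2D user, a two-valued state (whether the cellular QoS was met in the previous step), a shared reward equal to the sum capacity when the cellular SINR threshold is met and a fixed penalty otherwise, and a team table indexed by the joint action. All constants, channel gains and function names are hypothetical.

```python
import math
import random

# All values below are hypothetical and chosen only for illustration.
POWER_LEVELS_W = [0.001, 0.01, 0.05, 0.1]   # discrete transmit power choices
NOISE_W = 1e-10                              # receiver noise power
CELLULAR_POWER_W = 0.1                       # fixed cellular user transmit power
SINR_MIN = 5.0                               # cellular QoS threshold (linear SINR)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1        # learning rate, discount, exploration

# Hypothetical channel gains for two D2D pairs sharing one cellular channel.
EXAMPLE_GAINS = {
    "c2b": 1e-7,                             # cellular user -> base station
    "d2b": [2e-8, 3e-8],                     # D2D transmitter i -> base station
    "c2d": [1e-9, 1e-9],                     # cellular user -> D2D receiver i
    "d2d": [[1e-6, 1e-9], [1e-9, 1e-6]],     # D2D transmitter j -> D2D receiver i
}


def environment(powers, gains):
    """Return (reward, next_state) for one joint choice of D2D powers."""
    n = len(powers)
    # Cellular SINR at the base station, interfered by every D2D transmitter.
    interference = sum(powers[i] * gains["d2b"][i] for i in range(n))
    cell_sinr = CELLULAR_POWER_W * gains["c2b"] / (NOISE_W + interference)
    qos_met = cell_sinr >= SINR_MIN
    capacity = math.log2(1.0 + cell_sinr)
    for i in range(n):
        cross = sum(powers[j] * gains["d2d"][j][i] for j in range(n) if j != i)
        denom = NOISE_W + CELLULAR_POWER_W * gains["c2d"][i] + cross
        capacity += math.log2(1.0 + powers[i] * gains["d2d"][i][i] / denom)
    # Assumed reward: sum capacity if the cellular QoS holds, else a penalty.
    reward = capacity if qos_met else -1.0
    return reward, int(qos_met)


def train_distributed(gains, n_agents, steps=2000, seed=0):
    """Each agent keeps its own table q[agent][state][action]."""
    rng = random.Random(seed)
    n_actions = len(POWER_LEVELS_W)
    q = [[[0.0] * n_actions for _ in range(2)] for _ in range(n_agents)]
    state = 0
    for _ in range(steps):
        actions = []
        for i in range(n_agents):
            if rng.random() < EPSILON:           # explore
                actions.append(rng.randrange(n_actions))
            else:                                # exploit own table only
                row = q[i][state]
                actions.append(row.index(max(row)))
        powers = [POWER_LEVELS_W[a] for a in actions]
        reward, next_state = environment(powers, gains)
        for i, a in enumerate(actions):          # independent per-agent update
            target = reward + GAMMA * max(q[i][next_state])
            q[i][state][a] += ALPHA * (target - q[i][state][a])
        state = next_state
    return q


def table_sizes(n_agents, n_states=2, n_actions=len(POWER_LEVELS_W)):
    """Entries in one joint-action table vs. all per-agent tables combined."""
    return n_states * n_actions ** n_agents, n_agents * n_states * n_actions
```

Calling `train_distributed(EXAMPLE_GAINS, 2)` returns one small table per D2D user. Under the joint-action assumption, `table_sizes` shows why the per-agent tables scale better: the single team table grows exponentially with the number of D2D users, while the combined size of the per-agent tables grows linearly.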
