Publication | Open Access

Distributed Learning Based Handoff Mechanism for Radio Access Network Slicing with Data Sharing

Citations: 19

References: 10

Year: 2019

TLDR

Network slicing enables future mobile networks to meet diverse QoS demands, but its UE‑BS‑NS three‑layer association makes handoff highly complex and beyond conventional policies. This work proposes LESS, a multi‑agent reinforcement‑learning based smart handoff policy with data sharing, to lower handoff cost while preserving user QoS in RAN slicing. LESS consists of LESS‑DL, a distributed Q‑learning algorithm with a reduced action space for selecting target base stations and network slices, and LESS‑DS, a limited‑data sharing mechanism that updates users’ Q‑values to improve decision accuracy. Simulation results demonstrate that LESS markedly reduces handoff cost compared to traditional non‑learning handoff policies in typical scenarios.

Abstract

Network slicing (NS) has been identified as a fundamental technology for future mobile networks to meet extremely diverse communication requirements by providing tailored quality of service (QoS). However, introducing NS into radio access networks (RAN) creates a three-layer UE-BS-NS association (user equipment, base station, network slice), which makes handoff too complicated for conventional policies to resolve. In this paper, we propose a multi-agent reinforcement LEarning based Smart handoff policy with data Sharing, named LESS, to reduce handoff cost while maintaining user QoS requirements in RAN slicing. To address the large action space introduced by multiple users and the data sparsity caused by user mobility, LESS has two components: 1) LESS-DL, a modified distributed Q-learning algorithm with a small action space that makes handoff decisions; 2) LESS-DS, a data sharing mechanism that uses limited data to improve the accuracy of the handoff decisions made by LESS-DL. When a handoff occurs, LESS uses LESS-DL to choose both the target base station and the target NS, and then updates the Q-values of each user according to LESS-DS. Numerical results show that in typical scenarios, LESS significantly reduces handoff cost compared with traditional handoff policies without learning.
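
The two-component idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the state encoding, the reward, the modification made to distributed Q-learning, or the sharing rule, so everything below is an assumption. Each user is given its own Q-table over (base station, slice) targets, which keeps the action space per agent small instead of joint across users. The reward is assumed to be a negative handoff cost, the update is the standard Q-learning rule, and the hypothetical `share_data` function fills sparsely visited entries with the mean of better-sampled peers.

```python
import random
from collections import defaultdict


class HandoffAgent:
    """One user's independent Q-learner over (base station, slice) targets."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=None):
        self.actions = list(actions)      # candidate (bs, slice) pairs
        self.alpha = alpha                # learning rate
        self.gamma = gamma                # discount factor
        self.epsilon = epsilon            # exploration probability
        self.q = defaultdict(float)       # (state, action) -> value estimate
        self.visits = defaultdict(int)    # (state, action) -> update count
        self.rng = random.Random(seed)

    def choose(self, state):
        """Epsilon-greedy choice of a handoff target."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update (assumed; the paper uses a modified rule)."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        key = (state, action)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])
        self.visits[key] += 1


def share_data(agents, min_visits=3):
    """Assumed sharing rule: overwrite sparsely visited entries with the
    mean Q-value of agents that have visited the entry at least min_visits times."""
    keys = set().union(*(a.visits.keys() for a in agents))
    for key in keys:
        donors = [a.q[key] for a in agents if a.visits[key] >= min_visits]
        if not donors:
            continue
        mean_q = sum(donors) / len(donors)
        for a in agents:
            if a.visits[key] < min_visits:
                a.q[key] = mean_q


if __name__ == "__main__":
    targets = [(bs, ns) for bs in range(2) for ns in range(2)]
    veteran = HandoffAgent(targets, seed=0)
    newcomer = HandoffAgent(targets, seed=1)
    for _ in range(5):
        # Reward is the negative of an assumed handoff cost of 1.0.
        veteran.update("cell_edge", (1, 0), -1.0, "cell_edge")
    share_data([veteran, newcomer])
    print(newcomer.q[("cell_edge", (1, 0))])
```

In this sketch the newcomer inherits the veteran's estimate for a target it has never tried, which is one simple way to picture how sharing could offset the data sparsity that mobility causes.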
