STDPG: A Spatio-Temporal Deterministic Policy Gradient Agent for Dynamic Routing in SDN

Abstract

Dynamic routing in software-defined networking (SDN) can be viewed as a centralized decision-making problem. Most existing deep reinforcement learning (DRL) agents can address it thanks to the deep neural networks (DNNs) they incorporate. However, these agents usually adopt a fully connected feed-forward neural network (FFNN), which ignores the spatial correlation and temporal variation of traffic flows. This drawback usually leads to significantly high computational complexity because of the large number of training parameters. To overcome this problem, we propose a novel model-free framework for dynamic routing in SDN, referred to as the spatio-temporal deterministic policy gradient (STDPG) agent. The actor and critic networks share an identical DNN structure, CNN-LSTM-TAM, which combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network and a temporal attention mechanism (TAM). By efficiently exploiting spatial and temporal features, CNN-LSTM-TAM helps the STDPG agent learn better from experience transitions. Furthermore, we employ the prioritized experience replay (PER) method to accelerate the convergence of model training. The experimental results show that STDPG can automatically adapt to the current network environment and achieve robust convergence. Compared with a number of state-of-the-art DRL agents, STDPG achieves better routing solutions in terms of average end-to-end delay.
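
The abstract does not say which PER variant STDPG uses, so the following is only a minimal sketch of the proportional variant, under stated assumptions. The class name `PrioritizedReplayBuffer`, its methods, and the values of `alpha`, `beta`, and `eps` are hypothetical and chosen for illustration. Transitions with a larger temporal-difference (TD) error are sampled more often, and importance-sampling weights compensate for the resulting bias.

```python
import random


class PrioritizedReplayBuffer:
    """Illustrative proportional PER buffer (list-based, O(n) sampling).

    Hypothetical sketch: names and defaults are assumptions, not taken
    from the STDPG paper.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha    # 0 gives uniform sampling; larger values skew harder
        self.eps = eps        # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0          # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority so that
        # each one is likely to be replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idxs = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the largest in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, priorities follow the new absolute TD errors.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a deterministic policy gradient training loop, the returned weights would typically scale each sample's critic loss, and `update_priorities` would be called with the fresh TD errors after each update. A production buffer would usually replace the lists with a sum tree to make sampling logarithmic in the buffer size.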
