Virtual Network Function Placement Optimization With Deep Reinforcement Learning

TLDR

Network Function Virtualization shifts network functions from dedicated hardware to software on general‑purpose machines, but optimal placement of virtual functions is an NP‑hard optimization problem that has traditionally been tackled with heuristic and meta‑heuristic methods. The study proposes using reinforcement learning to learn an optimization policy for virtual network function placement. By extending Neural Combinatorial Optimization to incorporate constraints, the authors train an agent that explores the NFV infrastructure and learns placement decisions that minimize overall power consumption. Experiments show that combining the learned strategy with heuristics yields highly competitive results using relatively simple algorithms.

Abstract

Network Function Virtualization (NFV) introduces a new network architecture framework that evolves network functions, traditionally deployed over dedicated equipment, to software implementations that run on general-purpose hardware. One of the main challenges in deploying NFV is the optimal placement of the demanded network services on the resources of the NFV infrastructure. Virtual network function placement and network embedding can be formulated as a mathematical optimization problem subject to a set of feasibility constraints that express the restrictions of the network infrastructure and the services contracted. This problem has been reported to be NP-hard; as a result, most of the optimization work carried out in the area has focused on designing heuristic and metaheuristic algorithms. Nevertheless, in highly constrained problems such as this one, inferring a competitive heuristic can be a daunting task that requires expertise. Consequently, an interesting solution is the use of Reinforcement Learning to model an optimization policy. The work presented here extends the Neural Combinatorial Optimization theory by considering constraints in the definition of the problem. The resulting agent is able to learn placement decisions by exploring the NFV infrastructure with the aim of minimizing the overall power consumption. The experiments conducted demonstrate that when the proposed strategy is also combined with heuristics, highly competitive results are achieved using relatively simple algorithms.
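
To make the idea of a constrained placement objective concrete, the following is a minimal sketch of how one candidate placement could be scored. It is not the authors' formulation: the abstract does not specify the power model, the constraint set, or how constraints enter the learning signal. The sketch assumes a linear per-server power model, a single CPU-capacity constraint, and a fixed penalty weight added to the cost for each unit of violation. The names `Server`, `evaluate_placement`, and `penalty_weight` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # All fields are illustrative assumptions, not taken from the paper.
    cpu_capacity: int      # CPU cores available on the server
    idle_power: float      # watts drawn as soon as the server hosts anything
    power_per_cpu: float   # additional watts per occupied core


def evaluate_placement(placement, vnf_cpu_demand, servers, penalty_weight=100.0):
    """Score one placement; placement[i] is the server index hosting VNF i.

    Returns (power, violation, penalized_cost). A learning agent could use
    the negated penalized cost as its reward (assumption).
    """
    # Accumulate the CPU load that the placement puts on each server.
    used = [0] * len(servers)
    for vnf, host in enumerate(placement):
        used[host] += vnf_cpu_demand[vnf]

    power = 0.0
    violation = 0
    for server, load in zip(servers, used):
        if load > 0:
            # Only servers that host at least one VNF consume power.
            power += server.idle_power + server.power_per_cpu * load
        # Count cores requested beyond the server's capacity.
        violation += max(0, load - server.cpu_capacity)

    penalized_cost = power + penalty_weight * violation
    return power, violation, penalized_cost


servers = [Server(4, 100.0, 10.0), Server(4, 100.0, 10.0)]
demand = [2, 2, 2]  # CPU cores requested by each of three VNFs

feasible = evaluate_placement([0, 0, 1], demand, servers)
infeasible = evaluate_placement([0, 0, 0], demand, servers)
print(feasible)    # (260.0, 0, 260.0)
print(infeasible)  # (160.0, 2, 360.0)
```

In this toy instance, packing all three functions onto one server draws less raw power but exceeds its capacity, so the penalty makes it score worse than the feasible placement. This is the kind of trade-off a constraint-aware policy would have to learn.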
