Online Deep Reinforcement Learning for Computation Offloading in Blockchain-Empowered Mobile Edge Computing

TLDR

Offloading computation-intensive blockchain mining and data-processing tasks to edge or cloud resources is promising for blockchain-empowered mobile edge computing. Traditional auction-based and game-theoretic methods, however, cannot adapt to changing environments, and existing deep reinforcement learning approaches converge slowly because of high-dimensional action spaces. This work introduces a model-free deep reinforcement learning framework that jointly optimizes mining and data-processing offloading decisions. The authors formulate the problem as a Markov decision process, use deep reinforcement learning to track dynamic conditions, and add an adaptive genetic algorithm to the exploration step to accelerate convergence without sacrificing performance. Experiments show that the proposed algorithm converges quickly and outperforms three benchmark policies.

Abstract

Offloading computation-intensive tasks (e.g., blockchain consensus processes and data processing tasks) to the edge/cloud is a promising solution for blockchain-empowered mobile edge computing. However, traditional offloading approaches (e.g., auction-based and game-theory approaches) fail to adjust the policy according to the changing environment and cannot achieve good long-term performance. Moreover, existing deep reinforcement learning-based offloading approaches suffer from slow convergence caused by high-dimensional action spaces. In this paper, we propose a new model-free deep reinforcement learning-based online computation offloading approach for blockchain-empowered mobile edge computing in which both mining tasks and data processing tasks are considered. First, we formulate the online offloading problem as a Markov decision process that covers both blockchain mining tasks and data processing tasks. Then, to maximize long-term offloading performance, we leverage deep reinforcement learning to accommodate highly dynamic environments and address the computational complexity. Furthermore, we introduce an adaptive genetic algorithm into the exploration of deep reinforcement learning to effectively avoid useless exploration and speed up convergence without reducing performance. Finally, our experimental results demonstrate that our algorithm converges quickly and outperforms three benchmark policies.
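The abstract does not give the details of the adaptive genetic algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes the action is a binary vector (0 = run locally, 1 = offload), that some value estimate is available to rank candidate actions, and that the mutation rate shrinks for fitter candidates (a common adaptive-GA scheme). The names `adaptive_ga_explore` and `toy_q` and all parameter values are hypothetical; `toy_q` merely stands in for a learned Q-value.

```python
import random


def toy_q(action):
    """Hypothetical stand-in for a learned Q-value: offloading helps,
    but offloading too many tasks at once is penalized (congestion)."""
    k = sum(action)
    return 3.0 * k - 0.5 * k * k


def adaptive_ga_explore(score, n_tasks, pop_size=20, generations=15,
                        p_mut_max=0.3, p_mut_min=0.02, rng=None):
    """Search binary offloading vectors with a GA whose per-child mutation
    rate decreases as the child's fitness approaches the population best."""
    if rng is None:
        rng = random.Random()
    pop = [[rng.randint(0, 1) for _ in range(n_tasks)] for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        fits = [score(ind) for ind in pop]
        f_max, f_avg = max(fits), sum(fits) / len(fits)

        def pick():
            # Binary tournament selection: keep the fitter of two random members.
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if fits[a] >= fits[b] else pop[b]

        children = [best[:]]  # elitism: the best action so far always survives
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_tasks) if n_tasks > 1 else 0
            child = p1[:cut] + p2[cut:]  # one-point crossover
            f_child = score(child)
            if f_child >= f_avg and f_max > f_avg:
                # Above-average children get a smaller mutation rate.
                ratio = (f_max - f_child) / (f_max - f_avg)
                ratio = min(1.0, max(0.0, ratio))
                p_mut = p_mut_min + (p_mut_max - p_mut_min) * ratio
            else:
                # Below-average children are mutated aggressively.
                p_mut = p_mut_max
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        cand = max(pop, key=score)
        if score(cand) > score(best):
            best = cand
    return best


# Example: propose an exploratory action for 8 tasks.
action = adaptive_ga_explore(toy_q, n_tasks=8, rng=random.Random(0))
print(action, toy_q(action))
```

In a full agent, a function like this could replace uniformly random action sampling during exploration, so that the agent tries actions that the current value estimate already considers plausible instead of wandering through the whole 2^n action space.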
