A Reinforcement Learning Approach for Optimizing the Age-of-Computing-Enabled IoT

Abstract

Age of Information (AoI) is an emerging metric for measuring the freshness of information. In this article, we consider a multidevice computing-enabled Internet of Things (IoT) system with a common destination, in which a status update sampled by a device can either be offloaded directly to the destination for computing or be computed locally by the device and then delivered to the destination. We jointly design offloading and scheduling policies to minimize the average weighted sum of AoI and energy consumption. The challenge lies in computing mode selection and its strong coupling with scheduling decisions. To address this issue, we formulate the optimization problem as a bilevel discrete-time Markov decision process (MDP) and approximate the optimal solution by relative value iteration. Furthermore, we show that the MDP policy has a threshold structure. However, as the system scale grows, the MDP policy suffers from the curse of dimensionality. In light of this, we develop a learning-based algorithm built on deep reinforcement learning (DRL) to reduce the dimensionality of the state space, and we use a late experience storage method to train two heterogeneous artificial neural networks (ANNs) synchronously. Simulation results illustrate the structure of the MDP policy and verify that the DRL policy achieves near-optimal performance.
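
To make the relative value iteration step and the threshold structure concrete, the following is a minimal sketch on a toy problem, not the article's bilevel formulation. It assumes a single device, a truncated AoI state in 1..a_max, two actions (idle or send an update), a per-slot cost equal to the AoI plus a weighted energy term, and a fixed delivery success probability. The function name and all parameter names and values (a_max, p_success, energy_weight) are hypothetical and chosen only for illustration.

```python
import numpy as np


def relative_value_iteration(a_max=30, p_success=0.8, energy_weight=4.0,
                             tol=1e-9, max_iter=100_000):
    """Average-cost relative value iteration on a toy single-device AoI MDP.

    Assumed toy model (not taken from the article):
      state  : AoI in {1, ..., a_max}, truncated at a_max
      action : 0 = idle, 1 = send an update
      cost   : AoI, plus energy_weight when an update is sent
      idle   : AoI grows by one slot
      update : AoI resets to 1 with probability p_success, else grows by one
    """
    aoi = np.arange(1, a_max + 1)
    grow = np.minimum(aoi + 1, a_max) - 1   # index of the state "AoI + 1"
    h = np.zeros(a_max)                     # relative value function

    for _ in range(max_iter):
        q_idle = aoi + h[grow]
        q_update = (aoi + energy_weight
                    + p_success * h[0] + (1.0 - p_success) * h[grow])
        q = np.stack([q_idle, q_update])    # shape (2, a_max)
        th = q.min(axis=0)                  # Bellman operator applied to h
        diff = th - h
        h = th - th[0]                      # normalize at reference state AoI = 1
        # Stop when the span of (Th - h) is small; its midpoint estimates
        # the optimal average cost per slot.
        if diff.max() - diff.min() < tol:
            gain = 0.5 * (diff.max() + diff.min())
            policy = q.argmin(axis=0)       # 0 = idle, 1 = update
            return gain, h, policy
    raise RuntimeError("relative value iteration did not converge")


gain, h, policy = relative_value_iteration()
threshold = int(np.argmax(policy)) + 1      # smallest AoI at which updating is chosen
print(f"average cost ~ {gain:.4f}, update when AoI >= {threshold}")
```

In this toy model the resulting policy is a threshold rule: the device stays idle while the AoI is small and sends an update once the AoI reaches a threshold, which illustrates the kind of structure the abstract refers to. The state space of this sketch grows linearly with a_max; with many devices the joint state space grows exponentially, which is the scaling problem that motivates the DRL approach.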
