Publication | Closed Access
Deep Reinforcement Learning-Based Adaptive Computation Offloading for MEC in Heterogeneous Vehicular Networks
Citations: 170
References: 32
Year: 2020
Mobile Data Offloading, Vehicle Communication, Engineering, Edge Computing, Computer Engineering, Multi-access Edge Computing, Systems Engineering, Vehicle Network, Optimal Policy, Low Latency, Mobile Computing, Computer Science, Mobile Edge Computing, Heterogeneous Vehicular Networks
Vehicular networks need efficient and reliable data communication technology to maintain low latency. Minimizing energy consumption and data communication delay is very challenging while the vehicle is moving and the wireless channels and bandwidth are time-varying. With the help of emerging mobile edge computing (MEC) servers, vehicles and roadside units (RSUs) can offload computing tasks to an MEC server associated with a base station (BS). However, the environment for offloading tasks to MEC, e.g., wireless channel states and available bandwidth, is unstable, so ensuring the efficiency of computation offloading in such an environment is a challenge. In this work, we design a task computation offloading model for a heterogeneous vehicular network; this model takes into account multiple stochastic tasks and the variability of wireless channels and bandwidth. To obtain a tradeoff between the cost of energy consumption and the cost of data transmission delay, and to avoid the curse of dimensionality caused by the large action space, we propose an adaptive computation offloading method based on deep reinforcement learning (ACORL) that can handle a continuous action space. ACORL adds an Ornstein-Uhlenbeck (OU) noise vector to the action, scaled by a different factor for each action, to support exploration. Multiple transmission devices can execute local processing and computation offloading to MEC. In addition, ACORL considers the variation of wireless channels and available bandwidth between adjacent time slots. The numerical results illustrate that the proposed ACORL can effectively learn the optimal policy and outperforms Dueling DQN and a greedy policy in the stochastic environment.
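The exploration step mentioned in the abstract (OU noise added to a continuous action, with a separate factor per action) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `OUNoise`, the helper `noisy_action`, the parameter values (`theta`, `sigma`, `dt`), the action bounds, and the example factors are all hypothetical, since the abstract does not specify them.

```python
import numpy as np


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process (hypothetical parameters).

    Update rule: x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, I)
    """

    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.mu = np.full(dim, mu, dtype=float)
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Restart the process at its long-run mean.
        self.x = self.mu.copy()

    def sample(self):
        # Mean-reverting drift plus Gaussian diffusion.
        drift = self.theta * (self.mu - self.x) * self.dt
        diffusion = self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(self.x.shape)
        self.x = self.x + drift + diffusion
        return self.x.copy()


def noisy_action(action, noise, factors, low=0.0, high=1.0):
    """Add per-action scaled OU noise and clip to the (assumed) action bounds."""
    action = np.asarray(action, dtype=float)
    factors = np.asarray(factors, dtype=float)
    return np.clip(action + factors * noise.sample(), low, high)


if __name__ == "__main__":
    # Assumed 3-dimensional action, e.g. offloading-related quantities in [0, 1].
    noise = OUNoise(dim=3, seed=0)
    policy_output = [0.5, 0.5, 0.5]
    factors = [1.0, 0.5, 0.1]  # assumed: a different exploration factor per action
    print(noisy_action(policy_output, noise, factors))
```

Because successive OU samples are correlated in time, the perturbation drifts smoothly rather than jumping independently at each step; the per-action factors let some action dimensions be explored more aggressively than others.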