Publication | Closed Access
A Deep Reinforcement Learning Approach to Improve the Learning Performance in Process Control
64 Citations | 31 References | Year 2021
Artificial Intelligence, Engineering, Machine Learning, Learning Control, Improved DRL Controller, Systems Engineering, Model Predictive Control, Robot Learning, Model-based Learning, Learning Performance, Intelligent Control, Computer Engineering, Action Model Learning, Sequential Decision Making, Computer Science, Deep Learning, Deep Reinforcement Learning, Policy Gradient, Process Control, AI-based Process Optimization
Industrial process control relies on model-based methods whose models require regular maintenance, whereas reinforcement learning can update its policy online through interaction with the environment. The paper proposes an improved deep deterministic actor-critic predictor that separates the immediate reward from the action-value function so that the actor receives reliable gradients early in training, aiming for fast, stable learning that improves controller adaptability. It also derives an expectation form of the policy gradient under the assumption that the state is normally distributed. In simulation, the proposed algorithm learns more stably and quickly than state-of-the-art DRL methods and outperforms fine-tuned PID and linear MPC controllers, especially on nonlinear processes, suggesting that the improved DRL controller could be a valuable tool in practice.
Advanced model-based control methods have been widely used in industrial process control, but their excellent performance depends on regular maintenance of the underlying models. Reinforcement learning can update its policy online from data observed while interacting with the environment. Since a fast and stable learning process is required to improve the adaptability of the controller, we propose an improved deep deterministic actor-critic predictor in this paper, in which the immediate reward is separated from the action-value function to provide the actor with reliable gradient information at early stages. An expectation form of the policy gradient is then developed under the assumption that the state follows a normal distribution. Simulation results show that the proposed algorithm achieves a more stable and faster learning procedure than state-of-the-art deep reinforcement learning (DRL) algorithms. Moreover, the resulting policy outperforms fine-tuned proportional-integral-derivative (PID) and linear model predictive controllers, especially on nonlinear processes. These results indicate that the improved DRL controller has the potential to become an important tool in practical applications.
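The idea of separating the immediate reward from the action-value function can be illustrated with a small toy example. The sketch below is not the paper's algorithm. It assumes a hypothetical one-dimensional process with reward `-(s + a)^2`, a quadratic reward model fitted by least squares, and a linear policy `a = k * s`. All names (`features`, `fit_reward_model`, `train_actor`, `toy_reward`, `run_demo`) are illustrative, and the bootstrapped future-value term is omitted entirely, so the actor is driven only by the gradient of the fitted reward model.

```python
import numpy as np


def features(s, a):
    """Quadratic features [1, s, a, s^2, s*a, a^2] (an assumed model class)."""
    s = np.asarray(s, dtype=float)
    a = np.asarray(a, dtype=float)
    return np.stack([np.ones_like(s), s, a, s * s, s * a, a * a], axis=-1)


def fit_reward_model(s, a, r):
    """Supervised least-squares fit of the immediate reward r(s, a)."""
    w, *_ = np.linalg.lstsq(features(s, a), np.asarray(r, dtype=float), rcond=None)
    return w


def reward_grad_wrt_action(w, s, a):
    """Analytic derivative of the fitted reward model with respect to a."""
    return w[2] + w[4] * s + 2.0 * w[5] * a


def train_actor(w, states, k=0.0, lr=0.5, steps=50):
    """Gradient ascent on the gain k of the linear policy a = k * s.

    Only the reward-model gradient is used; a full actor-critic method
    would add the gradient of a learned future-value term.
    """
    for _ in range(steps):
        actions = k * states
        grad_k = np.mean(reward_grad_wrt_action(w, states, actions) * states)
        k += lr * grad_k
    return k


def toy_reward(s, a):
    """Hypothetical reward: penalise the deviation of s + a from zero."""
    return -(s + a) ** 2


def run_demo(seed=0):
    """Fit the reward model on random transitions, then train the actor."""
    rng = np.random.default_rng(seed)
    s = rng.uniform(-1.0, 1.0, 200)
    a = rng.uniform(-1.0, 1.0, 200)
    w = fit_reward_model(s, a, toy_reward(s, a))
    k = train_actor(w, s)
    return w, k
```

Calling `run_demo()` returns the fitted reward weights and the learned gain. In this toy setting the reward model is a plain regression problem, so it can be fitted from a few random transitions before any value estimate has converged. That is one plausible reading of why a separate reward term could give the actor usable gradients early in training.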