Publication | Open Access
Fidelity-Based Probabilistic Q-Learning for Control of Quantum Systems
Citations: 151
References: 49
Year: 2013
Artificial Intelligence, Engineering, Value Function Approximation, Education, Reinforcement Learning (Educational Psychology), Reinforcement Learning Methods, Learning Control, Lifelong Reinforcement Learning, Quantum Programming, Quantum Computing, Reinforcement Learning (Computer Engineering), Quantum Systems, Quantum Optimization Algorithm, Quantum Machine Learning, Quantum Annealing, Quantum Science, Quantum Feedback, Quantum Information, Computer Science, Fidelity-based Probabilistic Q-learning, Deep Reinforcement Learning, Quantum Devices, Quantum Algorithms, FPQL Algorithm
Reinforcement learning, particularly Q‑learning, faces a fundamental challenge in balancing exploration and exploitation. This work proposes a fidelity‑based probabilistic Q‑learning (FPQL) algorithm to address that trade‑off and applies it to quantum system control. The algorithm iteratively updates the probability of selecting each action, using fidelity as a guiding signal within a probabilistic Q‑learning framework, and is tested on a spin‑1/2 system and a Λ‑type atomic system. Results on both examples show that FPQL achieves a better exploration–exploitation balance, avoids locally optimal policies, and speeds up learning.
Balancing exploration and exploitation is a key problem for reinforcement learning methods, especially for Q-learning. In this paper, a fidelity-based probabilistic Q-learning (FPQL) approach is presented to solve this problem naturally, and it is applied to learning control of quantum systems. In this approach, fidelity is used to help direct the learning process, and the probability of selecting each action in a given state is updated iteratively as learning proceeds. This leads to a natural exploration strategy rather than a prescribed one that relies on configured parameters. A probabilistic Q-learning (PQL) algorithm is first presented to demonstrate the basic idea of probabilistic action selection. The FPQL algorithm is then presented for learning control of quantum systems. Two examples (a spin-1/2 system and a Λ-type atomic system) are used to test the performance of the FPQL algorithm. The results show that FPQL attains a better balance between exploration and exploitation, and can also avoid locally optimal policies and accelerate the learning process.
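The idea of probabilistic action selection can be illustrated with a minimal tabular sketch. This is not the paper's exact update rule: the probability reinforcement step, the parameter names `alpha`, `gamma`, and `k`, and the use of fidelity as a reward signal are all assumptions made for illustration. Only the pure-state fidelity formula, |⟨target|ψ⟩|², is standard.

```python
import random

def fidelity(psi, target):
    """Pure-state fidelity |<target|psi>|^2; states are sequences of amplitudes."""
    overlap = sum(t.conjugate() * p for t, p in zip(target, psi))
    return abs(overlap) ** 2

def normalize(probs):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(probs)
    return [p / total for p in probs]

def select_action(P, s, rng=random):
    """Sample an action index from the per-state distribution P[s]."""
    return rng.choices(range(len(P[s])), weights=P[s])[0]

def pql_step(Q, P, s, a, r, s_next, alpha=0.1, gamma=0.9, k=0.05):
    """One learning step: standard Q-learning update plus an ASSUMED
    probability reinforcement (hypothetical rule, not taken from the paper)."""
    best_next = max(Q[s_next])
    # Standard Q-learning temporal-difference update.
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    # Assumed rule: raise the chosen action's probability in proportion
    # to the reward plus the estimated value of the next state.
    P[s][a] = max(P[s][a] + k * (r + best_next), 1e-6)
    P[s] = normalize(P[s])

# Usage: two states, two actions, uniform initial action probabilities.
Q = [[0.0, 0.0], [0.0, 0.0]]
P = [[0.5, 0.5], [0.5, 0.5]]
# Assumption: the reward is the fidelity between the reached and target state.
reward = fidelity([1 + 0j, 0j], [1 + 0j, 0j])
action = select_action(P, 0, random.Random(0))
pql_step(Q, P, 0, action, reward, 1)
```

In this sketch, exploration comes from sampling actions out of `P[s]` instead of from a separately tuned parameter such as an epsilon-greedy rate, which is the contrast the abstract draws with conventional strategies.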