Technical Note: Q-Learning

References


Reference	Year	Citations
Learning from delayed rewards. Chris Watkins. OpenGrey (Institut de l'Information Scientifique et Technique).	1989	5.5K
Learning to Predict by the Methods of Temporal Differences. Richard S. Sutton. Machine Learning.	1988	3.9K
Applied Dynamic Programming. Marshall Freimer, Richard Bellman, Stuart E. Dreyfus. Journal of the American Statistical Association.	1964	2.2K
Self-improving reactive agents based on reinforcement learning, planning and teaching. Long-Ji Lin. Machine Learning.	1992	1.6K
Introduction to Stochastic Dynamic Programming. Donald A. Berry, Sheldon M. Ross. Journal of the American Statistical Association.	1986	1.2K
Temporal credit assignment in reinforcement learning. Richard S. Sutton. ScholarWorks@UMassAmherst (University of Massachusetts Amherst).	1984	778
Input generalization in delayed reinforcement learning: an algorithm and performance comparisons. David Chapman, Leslie Pack Kaelbling.	1991	249
Automatic programming of behavior-based robots using reinforcement learning. Sridhar Mahadevan, Jonathan H. Connell. Artificial Intelligence.	1992	150
Learning control of finite Markov chains with an explicit trade-off between estimation and control. Mitsuo Satõ, K. Abe, Hiroshi Takeda. IEEE Transactions on Systems Man and Cybernetics.	1988	34
