Publication | Closed Access
An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming
21
Citations
14
References
2007
Year
Mathematical ProgrammingEngineeringStochastic OptimizationProbability DistributionSystems EngineeringDynamic ProgrammingOptimal PolicySimulation-based AlgorithmModeling And SimulationComputer ScienceCombinatorial OptimizationSimulation OptimizationStochastic DynamicDynamic OptimizationOperations Research
We present a simulation-based algorithm called "Simulated Annealing Multiplicative Weights" (SAMW) for solving large finite-horizon stochastic dynamic programming problems. At each iteration of the algorithm, a probability distribution over candidate policies is updated by a simple multiplicative weight rule, and with proper annealing of a control parameter, the generated sequence of distributions converges to a distribution concentrated only on the best policies. The algorithm is "asymptotically efficient," in the sense that for the goal of estimating the value of an optimal policy, a provably convergent finite-time upper bound for the sample mean is obtained
| Year | Citations | |
|---|---|---|
Page 1
Page 1