Concepedia

Publication | Open Access

A survey of some simulation-based algorithms for Markov decision processes

Citations: 19

References: 25

Year: 2007

Abstract

Many problems modeled by Markov decision processes (MDPs) have very large state and/or action spaces, leading to the well-known curse of dimensionality that makes solution of the resulting models intractable. In other cases, the system of interest is complex enough that it is not feasible to explicitly specify some of the MDP model parameters, but simulated sample paths can be readily generated (e.g., for random state transitions and rewards), albeit at a non-trivial computational cost. For these settings, we have developed various sampling and population-based numerical algorithms to overcome the computational difficulties of computing an optimal solution in terms of a policy and/or value function. Specific approaches presented in this survey include multi-stage adaptive sampling, evolutionary policy iteration and evolutionary random policy search.
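The setting described in the abstract, where transition probabilities are not written down but sample paths can be simulated, can be illustrated with a minimal Python sketch. It is not one of the survey's algorithms: it only shows plain Monte Carlo estimation of a fixed policy's discounted value from simulated paths. The two-state simulator `simulate_step`, its probabilities and rewards, and the names `estimate_value`, `horizon`, `num_paths` and `discount` are all assumptions made up for illustration.

```python
import random


def simulate_step(state, action, rng):
    """Hypothetical black-box simulator for a toy two-state MDP.

    Returns (next_state, reward). The caller never sees the transition
    probabilities; it can only draw samples.
    """
    # Assumed dynamics: action 1 reaches state 1 with probability 0.8,
    # action 0 reaches it with probability 0.2, regardless of current state.
    p_good = 0.8 if action == 1 else 0.2
    next_state = 1 if rng.random() < p_good else 0
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward


def estimate_value(policy, start_state, horizon, num_paths, discount=0.95, seed=0):
    """Monte Carlo estimate of a policy's finite-horizon discounted value."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_paths):
        state = start_state
        path_return = 0.0
        weight = 1.0
        for _ in range(horizon):
            action = policy(state)
            state, reward = simulate_step(state, action, rng)
            path_return += weight * reward
            weight *= discount
        total += path_return
    # Average the sampled returns over all simulated paths.
    return total / num_paths


def always_one(state):
    return 1


def always_zero(state):
    return 0


if __name__ == "__main__":
    print(estimate_value(always_one, 0, horizon=10, num_paths=2000))
    print(estimate_value(always_zero, 0, horizon=10, num_paths=2000))
```

Every call to `simulate_step` stands in for the non-trivial simulation cost the abstract mentions, so the total budget here is `horizon * num_paths` simulator calls per policy. Comparing policies this way scales poorly as the policy space grows, which is the kind of difficulty that motivates smarter allocation of samples.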

