Markov Decision Processes with Imprecise Transition Probabilities

TLDR

The imprecise transition probability model is well suited for representing statistically derived confidence limits and natural language likelihood statements. The study introduces new numerical algorithms and bounds for infinite‑horizon, discrete‑stage, finite‑state, finite‑action Markov decision processes with imprecise transition probabilities. Assuming each state‑action transition probability vector is defined by linear inequalities, the authors compute an optimal max‑min strategy using successive approximations, reward revision, and modified policy iteration. The resulting bounds are at least as tight as those available for precise transition probability models.

Abstract

We present new numerical algorithms and bounds for the infinite-horizon, discrete-stage, finite-state, finite-action Markov decision process with imprecise transition probabilities. We assume that the transition probability mass vector for each state and action is described by a finite number of linear inequalities. This model of imprecision appears to be well suited for describing statistically determined confidence limits and natural language statements of likelihood. The numerical procedures for calculating an optimal max-min strategy are based on successive approximations, reward revision, and modified policy iteration. The bounds that are determined are at least as tight as currently available bounds for the case where the transition probabilities are precise.
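To make the max-min idea concrete, the following is a minimal sketch of successive approximations for a discounted problem. It is not the authors' procedure: it assumes the simplest kind of linear inequality description, namely per-state interval bounds lower <= p <= upper with the probabilities summing to one, for which the inner minimization can be solved greedily. General linear inequalities would need a linear programming solver for that inner step. The function names, the discount factor `beta`, and the stopping tolerance are all hypothetical choices made for illustration.

```python
def worst_case_expectation(lower, upper, values):
    """Minimize sum(p[j] * values[j]) over lower <= p <= upper with sum(p) == 1.

    Assumes the interval set is non-empty: sum(lower) <= 1 <= sum(upper).
    """
    p = list(lower)                 # start every state at its lower bound
    slack = 1.0 - sum(lower)        # probability mass still to be assigned
    # Nature plays against the decision maker: put the free mass on the
    # successor states with the smallest values first.
    for j in sorted(range(len(values)), key=lambda k: values[k]):
        add = min(upper[j] - lower[j], slack)
        p[j] += add
        slack -= add
    return sum(pj * vj for pj, vj in zip(p, values))


def robust_q_values(rewards, lower, upper, beta, v):
    """One max-min backup: Q(s, a) = r(s, a) + beta * min_p sum_j p[j] * v[j]."""
    return [
        [
            rewards[s][a]
            + beta * worst_case_expectation(lower[s][a], upper[s][a], v)
            for a in range(len(rewards[s]))
        ]
        for s in range(len(rewards))
    ]


def robust_value_iteration(rewards, lower, upper, beta, tol=1e-8, max_iter=10_000):
    """Successive approximations for the max-min value function.

    rewards[s][a] is the one-stage reward; lower[s][a] and upper[s][a] are
    lists of per-successor probability bounds. Requires 0 <= beta < 1.
    """
    v = [0.0] * len(rewards)
    for _ in range(max_iter):
        q = robust_q_values(rewards, lower, upper, beta, v)
        new_v = [max(row) for row in q]
        diff = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if diff < tol:
            break
    # Greedy strategy with respect to the final value estimate.
    q = robust_q_values(rewards, lower, upper, beta, v)
    policy = [max(range(len(row)), key=lambda a: row[a]) for row in q]
    return v, policy
```

When `lower` equals `upper` for every state and action, the set of admissible transition vectors collapses to a single point and the sketch reduces to ordinary value iteration for a precise model. Widening the intervals can only lower the resulting max-min value.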
