A theoretical analysis of Model-Based Interval Estimation
Citations: 146
References: 8
Year: 2005
Mathematical Programming, Artificial Intelligence, Parameter Estimation, Engineering, Machine Learning, Markov Decision Processes, Average Loss, Operations Research, Theoretical Analysis, Stochastic Game, Uncertainty Quantification, Management, Interval Analysis, Robot Learning, Decision Theory, Statistics, Near-optimal Policies, Sequential Decision Making, Computer Science, Markov Decision Process, Exploration V Exploitation, Interval Computation, Statistical Inference
Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents the first theoretical analysis of MBIE, proving its efficiency even under worst-case conditions. The paper also introduces a new performance metric, average loss, and relates it to its less "online" cousins from the literature.