Nonatomic total rewards Markov decision processes with multiple criteria
[Abstract] We consider a Markov decision process with an uncountable state space for which the vector performance functional has the form of expected total rewards. Under the single condition that the initial distribution and transition probabilities are nonatomic, we prove that the performance space coincides with that generated by nonrandomized Markov policies. We also provide conditions for the existence of optimal policies when the goal is to maximize one component of the performance vector subject to inequality constraints on the other components. We illustrate our results with examples of production and financial problems. © 2002 Elsevier Science (USA). All rights reserved.
[Publication date] 2002-09-01