Exact solution of the Bellman equation for a β-discounted reward in a two-armed bandit with switching arms
[Abstract] We consider the symmetric Poissonian two-armed bandit problem. For the case of switching arms, only one of which creates reward, we explicitly solve the Bellman equation for a β-discounted reward and prove that a myopic policy is optimal.
[Subject classification] Applied mathematics