Reinforcement learning : theory, methods and application to decision support systems
[Abstract] ENGLISH ABSTRACT: In this dissertation we study the machine learning subfield of Reinforcement Learning (RL). After developing a coherent background, we apply a Monte Carlo (MC) control algorithm with exploring starts (MCES), as well as an off-policy Temporal-Difference (TD) learning control algorithm, Q-learning, to a simplified version of the Weapon Assignment (WA) problem. For the MCES control algorithm, a discount parameter of τ = 1 is used. This gives very promising results when applied to 7 × 7 grids, as well as 71 × 71 grids. The same discount parameter cannot be applied to the Q-learning algorithm, as it causes the Q-values to diverge. We take a greedy approach, setting ε = 0, and vary the learning rate (α) and the discount parameter (τ). Experimentation shows that the best results are found with α set to 0.1 and τ constrained in the region 0.4 ≤ τ ≤ 0.7. The MC control algorithm with exploring starts gives promising results when applied to the WA problem. It performs significantly better than the off-policy TD algorithm, Q-learning, even though it is almost twice as slow. The modern battlefield is a fast-paced, information-rich environment, where discovery of intent, situation awareness and the rapid evolution of concepts of operation and doctrine are critical success factors. Combining the techniques investigated and tested in this work with other techniques in Artificial Intelligence (AI) and modern computational techniques may hold the key to solving some of the problems we now face in warfare.
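For readers unfamiliar with the Q-learning settings named in the abstract, the following is a minimal sketch of a standard tabular Q-learning update with greedy action selection (ε = 0), using α = 0.1 and a value of τ inside the reported range. It is not taken from the dissertation: the function names, the state and action labels, and the reward value are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def q_learning_update(Q, state, action, reward, next_state, actions,
                      alpha=0.1, tau=0.5):
    """Apply one tabular Q-learning update and return the new Q-value.

    alpha is the learning rate and tau the discount parameter, named to
    match the abstract's notation. The defaults are illustrative only.
    """
    # Off-policy target: bootstrap from the best action in the next state.
    best_next = max(Q[(next_state, a)] for a in actions)
    td_target = reward + tau * best_next
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
    return Q[(state, action)]


def greedy_action(Q, state, actions):
    """Pick the action with the highest Q-value (epsilon = 0, no exploration)."""
    return max(actions, key=lambda a: Q[(state, a)])


# Hypothetical two-action example with all Q-values starting at zero.
Q = defaultdict(float)
actions = ["a0", "a1"]
new_value = q_learning_update(Q, "s0", "a1", reward=1.0,
                              next_state="s1", actions=actions)
chosen = greedy_action(Q, "s0", actions)
```

With all Q-values initialised to zero, the single update moves the Q-value of the taken action a fraction α of the way toward the reward, after which the greedy rule selects that action in the same state.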
[Publication date]  [Publishing institution] Stellenbosch University
[Level of authority]  [Subject classification] 
[Keywords]  [Validity] 