Decision forests for computer Go feature learning
[Abstract] In computer Go, moves are typically selected with the aid of a tree search algorithm. Monte-Carlo tree search (MCTS) is currently the dominant algorithm in computer Go. It has been shown that including domain knowledge in MCTS can vastly improve the strength of MCTS engines. A successful approach to representing domain knowledge in computer Go is the use of appropriately weighted tactical features and pattern features, which comprise a number of hand-crafted heuristics and a collection of patterns respectively. However, tactical features are hand-crafted specifically for Go, and pattern features are Go-specific, making it unclear how they can be easily transferred to other domains. As such, this work proposes a new approach to representing domain knowledge: decision tree features. These features evaluate a state-action pair by descending a decision tree, with queries recursively partitioning the state-action pair input space, and returning a weight corresponding to the partition element represented by the resultant leaf node. In this work, decision tree features are applied to computer Go in order to determine their feasibility in comparison to the state-of-the-art use of tactical and pattern features. In this application of decision tree features, each query in the decision tree descent path refines information about the board position surrounding a candidate move. The results of this work showed that a feature instance with decision tree features is a feasible alternative to the state-of-the-art use of tactical and pattern features in computer Go, in terms of move prediction and playing strength, even though computer Go is a relatively well-developed research area.
A move prediction rate of 35.9% was achieved with tactical and decision tree features, and they showed comparable performance to the state of the art when integrated into an MCTS engine with progressive widening. We conclude that the decision tree feature approach shows potential as a method for automatically extracting domain knowledge in new domains. These features can be used to evaluate state-action pairs for guiding search-based techniques, such as MCTS, or for action-prediction tasks.
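The abstract describes a decision tree feature as a descent through queries that ends in a leaf weight. The following is a minimal sketch of that idea only, not the thesis implementation: the board encoding, the `Node` class, the two queries and all weights are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Hypothetical encoding: a board maps (row, col) -> "black" or "white";
# empty points are simply absent from the dictionary.
Board = Dict[Tuple[int, int], str]
Move = Tuple[int, int]


@dataclass
class Node:
    """Internal node (query + children) or leaf (weight only)."""
    weight: float = 0.0
    query: Optional[Callable[[Board, Move], str]] = None
    children: Optional[Dict[str, "Node"]] = None


def evaluate(node: Node, board: Board, move: Move) -> float:
    """Descend from the root, following query answers, and return the leaf weight."""
    while node.query is not None:
        node = node.children[node.query(board, move)]
    return node.weight


def stone_above(board: Board, move: Move) -> str:
    """Illustrative query: what occupies the point directly above the move?"""
    row, col = move
    return board.get((row - 1, col), "empty")


def stone_left(board: Board, move: Move) -> str:
    """Illustrative query: what occupies the point directly left of the move?"""
    row, col = move
    return board.get((row, col - 1), "empty")


# Each query refines the partition of (board, move) inputs; the weights are made up.
tree = Node(query=stone_above, children={
    "empty": Node(weight=1.0),
    "black": Node(weight=1.5),
    "white": Node(query=stone_left, children={
        "empty": Node(weight=0.5),
        "black": Node(weight=2.0),
        "white": Node(weight=0.25),
    }),
})

board: Board = {(2, 3): "white", (3, 2): "black"}
weight_near_stones = evaluate(tree, board, (3, 3))  # white above, black to the left
weight_open_area = evaluate(tree, board, (5, 5))    # nothing above
```

In a real system the weights would be learned from data rather than set by hand, and the resulting value would be combined with other features to rank candidate moves.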
[Publication date] [Publishing institution] Stellenbosch University