Incentive learning in Monte Carlo tree search

Kuo Yuan Kao, I-Chen Wu, Shi Jim Yen, Yi Chang Shan

Research output: Contribution to journal › Article

3 Scopus citations

Abstract

Monte Carlo tree search (MCTS) is a search paradigm that has been remarkably successful in computer games like Go. It uses Monte Carlo simulation to evaluate the values of nodes in a search tree. The node values are then used to select the actions during subsequent simulations. The performance of MCTS heavily depends on the quality of its default policy, which guides the simulations beyond the search tree. In this paper, we propose an MCTS improvement, called incentive learning, which learns the default policy online. This new default policy learning scheme is based on ideas from combinatorial game theory, and hence is particularly useful when the underlying game is a sum of games. To illustrate the efficiency of incentive learning, we describe a game named Heap-Go and present experimental results on the game.
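The MCTS loop summarized in the abstract (simulate, evaluate nodes, reuse node values for selection, fall back on a default policy beyond the tree) can be sketched as follows. This is a minimal, generic UCT sketch with a uniformly random default policy; it is not the incentive learning scheme proposed in the paper. Because the rules of Heap-Go are not given here, the sketch assumes a stand-in game: single-heap Nim, where each move removes 1 to 3 stones and the player who takes the last stone wins. All names (`Node`, `default_policy`, `uct_select`, `mcts`) and the exploration constant are hypothetical choices made for illustration.

```python
import math
import random


class Node:
    """Search-tree node for the assumed stand-in game (single-heap Nim)."""

    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left for the player to move
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node


def default_policy(stones, rng):
    """Uniformly random playout beyond the tree.

    Returns True if the player to move at `stones` ends up winning.
    """
    moves_played = 0
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        moves_played += 1
    # Whoever takes the last stone wins; an odd move count means that
    # was the player who moved first in this playout.
    return moves_played % 2 == 1


def uct_select(node, c=1.4):
    """Pick the child maximizing the UCB1 score (c is an assumed constant)."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def mcts(root_stones, iterations=2000, seed=0):
    """Return the most-visited move from a heap of root_stones (>= 1)."""
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_select(node)
        # 2. Expansion: add one untried child, if any.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: the default policy finishes the game.
        to_move_wins = default_policy(node.stones, rng)
        # 4. Backpropagation: credit the player who moved into each node,
        #    flipping perspective at every level on the way up.
        win = not to_move_wins
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if win else 0.0
            win = not win
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

Calling `mcts(5)` runs the search from a heap of five stones; with enough iterations the search is expected to favor removing one stone, which leaves the opponent a heap of four. The random `default_policy` is the component that an online learning scheme such as the one described in the abstract would replace.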

Original language: English
Article number: 6468079
Pages (from-to): 346-352
Number of pages: 7
Journal: IEEE Transactions on Computational Intelligence and AI in Games
Volume: 5
Issue number: 4
State: Published - 1 Dec 2013

Keywords

  • Artificial intelligence
  • combinatorial games
  • computational intelligence
  • computer games
  • reinforcement learning
