Adding Expert Knowledge and Exploration in Monte-Carlo Tree Search

Chaslot, Guillaume; Fiter, Christophe; Hoock, Jean-Baptiste; Rimmel, Arpad; Teytaud, Olivier

doi:10.1007/978-3-642-12993-3_1

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6048)

Included in the following conference series:

Advances in Computer Games

Abstract

We present a new exploration term that is more efficient than classical UCT-like exploration terms. It efficiently combines expert rules, patterns extracted from datasets, All-Moves-As-First values, and classical online values. Because this improved bandit formula does not handle several important situations in computer Go (semeais, nakade), we present three further improvements that have been central to the recent progress of our program MoGo.

We show an expert-based improvement of Monte-Carlo simulations for nakade situations; we also emphasize some limitations of this modification.
We show a technique which preserves diversity in the Monte-Carlo simulation, which greatly improves the results in 19x19.
Whereas the UCB-based exploration term is not efficient in MoGo, we show a new exploration term which is highly efficient in MoGo.
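
For reference, the classical UCB-based exploration term that the abstract contrasts against can be sketched as below. This is a minimal illustration of the standard UCB1 rule used in UCT, not the new exploration term proposed in the chapter (whose formula is not given in this abstract). The function names, the `stats` layout, and the default constant `c` are assumptions made for the example.

```python
import math

def ucb1_score(mean_reward, child_visits, parent_visits, c=math.sqrt(2)):
    """Classical UCB1 score: empirical mean plus an exploration bonus."""
    if child_visits == 0:
        # Unvisited moves get an infinite score so they are tried first.
        return math.inf
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return mean_reward + exploration

def select_move(stats, parent_visits):
    """Pick the move with the highest UCB1 score.

    `stats` (hypothetical layout) maps move -> (mean_reward, visit_count).
    """
    return max(
        stats,
        key=lambda m: ucb1_score(stats[m][0], stats[m][1], parent_visits),
    )
```

The bonus shrinks as a move accumulates visits, so rarely tried moves are periodically revisited; this is the baseline behaviour the abstract reports as not efficient in MoGo.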

MoGo recently won a game with handicap 7 against a 9-dan professional player, Zhou JunXun, winner of the LG Cup 2007, and a game with handicap 6 against a 1-dan professional player, Li-Chen Chien.

Author information

Authors and Affiliations

Games and AI Group, MICC, Faculty of Humanities and Sciences, Universiteit Maastricht, Maastricht, The Netherlands
Guillaume Chaslot
TAO (Inria), LRI, UMR 8623 (CNRS - Univ. Paris-Sud), bat 490 Univ. Paris-Sud, 91405, Orsay, France
Christophe Fiter, Jean-Baptiste Hoock, Arpad Rimmel & Olivier Teytaud

Editor information

Editors and Affiliations

Games and AI Group, MICC, Faculty of Humanities and Sciences, Universiteit Maastricht, Maastricht, The Netherlands
H. Jaap van den Herik
Institute for Knowledge and Agent Technology, Universiteit Maastricht,
Pieter Spronck

About this paper

Cite this paper

Chaslot, G., Fiter, C., Hoock, JB., Rimmel, A., Teytaud, O. (2010). Adding Expert Knowledge and Exploration in Monte-Carlo Tree Search. In: van den Herik, H.J., Spronck, P. (eds) Advances in Computer Games. ACG 2009. Lecture Notes in Computer Science, vol 6048. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12993-3_1

DOI: https://doi.org/10.1007/978-3-642-12993-3_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12992-6
Online ISBN: 978-3-642-12993-3
eBook Packages: Computer Science, Computer Science (R0)
