Neurogammon is a computer backgammon program written by Gerald Tesauro at IBM's Thomas J. Watson Research Center. It was the first viable computer backgammon program implemented as a neural network, and it set a new standard in computer backgammon play. It won the 1st Computer Olympiad in London in 1989, handily defeating all opponents.[1] It played at the level of an intermediate human player.[2]
Neurogammon contains seven separate neural networks, each with a single hidden layer. One network makes doubling-cube decisions; the other six choose moves at different stages of the game. The networks were trained by backpropagation on transcripts of 400 games in which Tesauro played against himself. In each position, the move he chose was taught to the network as the best move.
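The training scheme described above can be illustrated with a minimal sketch. This is not Tesauro's code: it only shows a single-hidden-layer network fitted by backpropagation so that moves labelled as the expert's choice score higher than other moves. The four-feature encoding, the hidden-layer size, the learning rate, the toy rule standing in for the expert, and all names (`OneHiddenLayerNet`, `train`, `pick_move`) are assumptions made for illustration; the article does not specify Neurogammon's inputs or hyperparameters.

```python
import math
import random
from itertools import product


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class OneHiddenLayerNet:
    """Feed-forward network with one hidden layer and one sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # One row per hidden unit; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        # One output weight per hidden unit, plus a trailing bias.
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden))
                      + self.w_out[-1])
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """One backpropagation update on squared error; returns the loss."""
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas use the output weights from before this update.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w_out[j] -= lr * delta_out * h
        self.w_out[-1] -= lr * delta_out
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x):
                row[i] -= lr * delta_hidden[j] * v
            row[-1] -= lr * delta_hidden[j]
        return 0.5 * (out - target) ** 2


def train(net, examples, epochs=2000):
    """Run online backpropagation and return the mean loss of each epoch."""
    losses = []
    for _ in range(epochs):
        total = sum(net.train_step(x, target) for x, target in examples)
        losses.append(total / len(examples))
    return losses


def pick_move(net, candidates):
    """Return the candidate feature vector the network scores highest."""
    return max(candidates, key=lambda x: net.forward(x)[1])


# Toy data (assumption): a move is "the expert's move" (target 1.0) exactly
# when its first two binary features are both set. Real training data would
# come from encoded game transcripts instead.
examples = [(list(bits), 1.0 if bits[0] and bits[1] else 0.0)
            for bits in product([0, 1], repeat=4)]

net = OneHiddenLayerNet(n_in=4, n_hidden=3)
losses = train(net, examples)
best = pick_move(net, [[1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 0, 0]])
print(f"mean loss: {losses[0]:.4f} -> {losses[-1]:.4f}; chosen move: {best}")
```

In a system with several such networks, one instance would be trained per game stage and the matching network consulted at move time; how the stages are delimited is not stated in the article, so that dispatch is left out of the sketch.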
In 1992, Tesauro completed TD-Gammon, which combined temporal-difference learning, a form of reinforcement learning, with the human-designed input features of Neurogammon, and played at the level of a world-class human tournament player.
References
- Tesauro, Gerald (1989). "Neurogammon Wins Computer Olympiad" (PDF). Neural Computation. 1 (3): 321–323. doi:10.1162/neco.1989.1.3.321. Retrieved 2010-02-20.
- Tesauro, Gerald (March 1995). "Temporal Difference Learning and TD-Gammon". Communications of the ACM. 38 (3): 58–68. doi:10.1145/203330.203343. Retrieved 2010-02-08.
Further reading
- Mandziuk, Jacek (2010). "CI in Games – Selected Approaches". Knowledge-Free and Learning-Based Methods in Intelligent Game Playing. Berlin: Springer. pp. 71–89. ISBN 978-3-642-11677-3.