Brick Tic-Tac-Toe: Exploring the Generalizability of AlphaZero to Novel Test Environments

Min, John Tan Chong; Motani, Mehul

arXiv:2207.05991 (cs)

[Submitted on 13 Jul 2022 (v1), last revised 14 Jul 2022 (this version, v2)]

Abstract: Traditional reinforcement learning (RL) environments are typically the same for both the training and testing phases. Hence, current RL methods are largely not generalizable to a test environment which is conceptually similar but different from what the method has been trained on, which we term the novel test environment. As an effort to push RL research towards algorithms which can generalize to novel test environments, we introduce the Brick Tic-Tac-Toe (BTTT) test bed, where the brick position in the test environment is different from that in the training environment. Using a round-robin tournament on the BTTT environment, we show that traditional RL state-search approaches such as Monte Carlo Tree Search (MCTS) and Minimax are more generalizable to novel test environments than AlphaZero is. This is surprising because AlphaZero has been shown to achieve superhuman performance in environments such as Go, Chess and Shogi, which may lead one to think that it performs well in novel test environments. Our results show that BTTT, though simple, is rich enough to explore the generalizability of AlphaZero. We find that merely increasing MCTS lookahead iterations was insufficient for AlphaZero to generalize to some novel test environments. Rather, increasing the variety of training environments helps to progressively improve generalizability across all possible starting brick configurations.
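
To make the evaluation setup concrete, the following is a minimal Python sketch of a round-robin tournament run over held-out brick positions. It is not the authors' code. The board size, the win condition, the scoring (1 for a win, 0.5 for a draw) and the two stand-in agents are all assumptions chosen for illustration, since the abstract does not specify them. In the paper's setting the agents would be AlphaZero, MCTS and Minimax rather than these placeholders.

```python
import itertools
import random

SIZE = 3        # assumed board size, not taken from the paper
WIN_LENGTH = 3  # assumed number of marks in a row needed to win


def new_board(brick):
    """Return an empty board with one cell blocked by a brick ('#')."""
    board = [["." for _ in range(SIZE)] for _ in range(SIZE)]
    r, c = brick
    board[r][c] = "#"
    return board


def legal_moves(board):
    """All empty cells; the brick cell is never playable."""
    return [(r, c) for r in range(SIZE) for c in range(SIZE) if board[r][c] == "."]


def winner(board):
    """Return 'X' or 'O' if that mark has WIN_LENGTH in a row, else None."""
    for r in range(SIZE):
        for c in range(SIZE):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                cells = [(r + i * dr, c + i * dc) for i in range(WIN_LENGTH)]
                if all(0 <= rr < SIZE and 0 <= cc < SIZE for rr, cc in cells):
                    marks = {board[rr][cc] for rr, cc in cells}
                    if len(marks) == 1 and marks <= {"X", "O"}:
                        return marks.pop()
    return None


def random_agent(board, moves, rng):
    """Hypothetical baseline: pick a uniformly random legal move."""
    return rng.choice(moves)


def first_free_agent(board, moves, rng):
    """Hypothetical baseline: always pick the first legal move."""
    return moves[0]


def play(first, second, brick, rng):
    """Play one game; return 'X' (first wins), 'O' (second wins) or None (draw)."""
    board = new_board(brick)
    players = (("X", first), ("O", second))
    for turn in itertools.count():
        moves = legal_moves(board)
        if not moves:
            return None
        mark, agent = players[turn % 2]
        r, c = agent(board, moves, rng)
        board[r][c] = mark
        if winner(board) == mark:
            return mark


def round_robin(agents, bricks, rng):
    """Every ordered pair of distinct agents plays once per brick position."""
    points = {name: 0.0 for name in agents}
    for brick in bricks:
        for a, b in itertools.permutations(agents, 2):
            result = play(agents[a], agents[b], brick, rng)
            if result == "X":
                points[a] += 1
            elif result == "O":
                points[b] += 1
            else:
                points[a] += 0.5
                points[b] += 0.5
    return points


if __name__ == "__main__":
    train_brick = (0, 0)  # assumed training configuration
    # Novel test environments: every brick position not seen in training.
    test_bricks = [(r, c) for r in range(SIZE) for c in range(SIZE)
                   if (r, c) != train_brick]
    agents = {"random": random_agent, "first_free": first_free_agent}
    print(round_robin(agents, test_bricks, random.Random(0)))
```

Using ordered pairs means each agent moves first and second against every opponent on every brick position, which removes first-move advantage from the comparison. The train/test split on brick position is the only part of this sketch that comes directly from the abstract.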

Comments:	AlphaZero, Generalization
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2207.05991 [cs.LG]
	(or arXiv:2207.05991v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2207.05991

Submission history

From: Chong Min John Tan
[v1] Wed, 13 Jul 2022 06:53:46 UTC (1,285 KB)
[v2] Thu, 14 Jul 2022 03:29:55 UTC (1,284 KB)
