c4a0: Connect Four Alpha-Zero

An Alpha-Zero-style Connect Four engine trained entirely via self play.

The game logic, Monte Carlo Tree Search, and multi-threaded self play engine are written in rust.

The NN is written in Python/PyTorch and interfaces with rust via PyO3.

Usage

Install clang

# Instructions for Ubuntu/Debian (other OSs may vary)
sudo apt install clang

Install uv for python dep/env management

curl -LsSf https://astral.sh/uv/install.sh | sh

Install deps and create virtual env:

uv sync

Compile rust code

uv run python -m ensurepip --upgrade
uv run maturin develop --release

Train a network

uv run python src/c4a0/main.py train --max-gens=10

Play against the network

uv run python src/c4a0/main.py play --model=best

(Optional) Download a connect four solver to objectively measure training progress:

git clone https://github.com/PascalPons/connect4.git solver
cd solver
make
# Download opening book to speed up solutions
wget https://github.com/PascalPons/connect4/releases/download/book/7x6.book

Now pass the solver paths to train, score and other commands:

uv run python src/c4a0/main.py score solver/c4solver solver/7x6.book

Results

After 9 generations of training (approx. 15 min on an RTX 3090) we achieve the following results:

Architecture

PyTorch NN `src/c4a0/nn.py`

A resnet-style CNN that takes a board position as input and outputs a Policy (a probability distribution over moves, weighted by promise) and a Q Value (predicted win/loss value in [-1, 1]).
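To make the two outputs concrete, here is a minimal pure-Python sketch of how raw head outputs could be turned into a Policy and a Q Value. The function names, the masking of full columns, and the use of softmax and tanh are assumptions for illustration; the actual network in `src/c4a0/nn.py` may do this differently.

```python
import math

def policy_from_logits(logits, legal_moves):
    """Softmax restricted to legal columns; illegal columns get probability 0.

    Assumption: one raw logit per column (7 for Connect Four).
    """
    legal = set(legal_moves)
    # Subtract the largest legal logit for numerical stability.
    peak = max(logits[c] for c in legal)
    exps = [math.exp(l - peak) if c in legal else 0.0
            for c, l in enumerate(logits)]
    total = sum(exps)
    return [e / total for e in exps]

def q_from_raw(raw):
    """Squash an unbounded scalar into the [-1, 1] win/loss range."""
    return math.tanh(raw)
```

For example, `policy_from_logits([0.0] * 7, [0, 1, 2])` spreads the probability evenly over the three playable columns and assigns zero to the rest.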

Various NN hyperparameters are sweepable via the nn-sweep command.

Connect Four Game Logic `rust/src/c4r.rs`

Implements compact bitboard representation of board state (Pos) and all connect four rules and game logic.
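As an illustration of the bitboard idea, the sketch below uses one common layout: 7 bits per column (6 playable rows plus one always-empty sentinel bit), so that shifted comparisons never wrap between columns. It is written in Python for brevity; the `Board` class and its field names are hypothetical and are not taken from the `Pos` type in `rust/src/c4r.rs`.

```python
WIDTH, HEIGHT = 7, 6
H1 = HEIGHT + 1  # bits per column, including the sentinel bit

def has_four(bb):
    """Return True if the bitboard contains four aligned stones."""
    # Shifts: vertical, horizontal, and the two diagonals.
    for shift in (1, H1, H1 - 1, H1 + 1):
        pairs = bb & (bb >> shift)          # adjacent pairs in this direction
        if pairs & (pairs >> (2 * shift)):  # two pairs two steps apart = four
            return True
    return False

class Board:
    def __init__(self):
        self.stones = [0, 0]         # one bitboard per player
        self.heights = [0] * WIDTH   # stones already in each column
        self.to_move = 0

    def can_play(self, col):
        return self.heights[col] < HEIGHT

    def play(self, col):
        """Drop a stone in `col`; return True if the move wins the game."""
        bit = 1 << (col * H1 + self.heights[col])
        self.stones[self.to_move] |= bit
        self.heights[col] += 1
        won = has_four(self.stones[self.to_move])
        self.to_move ^= 1
        return won
```

The whole position fits in two integers plus the column heights, and a win check costs a handful of shifts and ANDs, which is what makes this representation attractive for a search that visits many positions.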

Monte Carlo Tree Search (MCTS) `rust/src/mcts.rs`

Implements Monte Carlo Tree Search, the core algorithm behind Alpha-Zero. It probabilistically explores potential game pathways and homes in on the best move to play from any position.

MCTS relies on outputs from the NN. The output of MCTS helps train the next generation's NN.
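The interplay between the NN and the search can be sketched as follows. This is a simplified Python illustration of one common form of the PUCT selection rule, the value backup, and the visit-count policy that can serve as a training target; the `Node` class, `c_puct` value, and function names are assumptions and do not mirror `rust/src/mcts.rs`.

```python
import math

class Node:
    def __init__(self, prior):
        self.prior = prior     # policy probability assigned by the NN
        self.visits = 0
        # Sum of backed-up values, from the viewpoint of the player
        # who made the move leading into this node.
        self.value_sum = 0.0
        self.children = {}     # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    """Pick the (move, child) pair maximising Q + U."""
    sqrt_total = math.sqrt(max(1, node.visits))

    def score(item):
        _, child = item
        # Exploration bonus: large for high-prior, rarely visited moves.
        u = c_puct * child.prior * sqrt_total / (1 + child.visits)
        return child.q() + u

    return max(node.children.items(), key=score)

def backup(path, leaf_value):
    """Propagate a leaf evaluation from leaf to root, flipping sign per ply.

    `leaf_value` is from the viewpoint of the player who moved into the leaf.
    """
    value = leaf_value
    for node in reversed(path):
        node.visits += 1
        node.value_sum += value
        value = -value

def visit_policy(node, num_moves=7):
    """Normalised child visit counts; call after at least one simulation."""
    total = sum(child.visits for child in node.children.values())
    return [node.children[m].visits / total if m in node.children else 0.0
            for m in range(num_moves)]
```

In this sketch the priors and leaf values would come from the NN, and `visit_policy` at the root is the kind of output that the next generation's Policy head could be trained to match.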

Self Play `rust/src/self_play.rs`

Uses rust multi-threading to parallelize self play (training data generation).
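The fan-out pattern can be sketched in Python as below. Python threads stand in for the rust threads here, and the game itself is stubbed out: moves are chosen uniformly at random instead of by MCTS, and win detection is omitted so every game runs until the board is full. All names are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 7, 6

def play_one_game(seed):
    """Play random legal moves until the board is full.

    Returns one (moves_so_far, chosen_move) sample per ply.
    """
    rng = random.Random(seed)  # per-game RNG keeps results reproducible
    heights = [0] * WIDTH
    moves, samples = [], []
    while any(h < HEIGHT for h in heights):
        legal = [c for c in range(WIDTH) if heights[c] < HEIGHT]
        move = rng.choice(legal)
        samples.append((tuple(moves), move))
        heights[move] += 1
        moves.append(move)
    return samples

def generate_samples(n_games, n_workers=4):
    """Run independent games on a worker pool and flatten their samples."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in seed order regardless of completion order.
        games = pool.map(play_one_game, range(n_games))
        return [sample for game in games for sample in game]
```

Because each game is independent, the only shared state is the final list of samples, which is why self play parallelizes so cleanly.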

Solver `rust/src/solver.rs`

Connect Four is a perfectly solved game. See Pascal Pons's great writeup on how to build a perfect solver. We can use these solutions to objectively measure our NN's performance. Importantly, we never train on these solutions; we use only our self-play data to improve the NN's performance.

solver.rs contains the stdin/stdout interface to the solver, which we use to obtain the objective solutions to our training positions. Because solutions are expensive to compute, we cache them in a local rocksdb database (solutions.db). We then check our training positions to see how often they recommend optimal moves as determined by the solver.
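The caching and scoring steps can be sketched in Python as follows. An in-memory dict stands in for the rocksdb database, and `solve_fn` is a hypothetical callable standing in for the external solver process; the per-column score format (with `None` for unplayable columns) is also an assumption, not the solver's actual output format.

```python
class CachedSolver:
    def __init__(self, solve_fn):
        # Hypothetical: maps a position key to a list of per-column scores.
        self._solve = solve_fn
        self._cache = {}   # stand-in for the on-disk solutions.db
        self.misses = 0    # number of times the expensive solver was called

    def scores(self, position):
        if position not in self._cache:
            self.misses += 1
            self._cache[position] = self._solve(position)
        return self._cache[position]

def optimal_move_rate(choices, solver):
    """Fraction of (position, chosen_column) pairs where the choice is optimal."""
    hits = total = 0
    for position, col in choices:
        scores = solver.scores(position)
        best = max(s for s in scores if s is not None)
        hits += scores[col] == best  # ties with the best score count as optimal
        total += 1
    return hits / total if total else 0.0
```

Repeated positions hit the cache instead of the solver, so the expensive call happens once per distinct position.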
