AI Goban/Baduk Board Detector
Computer vision system that reads a physical Go board from a photo and outputs a digital board state.
Overview
A pipeline that takes a photo of a Go board and produces an SGF file with the current position. Started with classical computer vision (OpenCV, Hough transforms), then moved to deep learning with Meta's SAM 3 for segmentation, running on rented GPUs via Vast.ai.
Problem
Recording physical Go games requires manual move-by-move transcription or expensive dedicated hardware. A camera-based system could make any board game digitally analysable.
Constraints
- Must work on varied boards, lighting conditions, and camera angles
- No dedicated hardware, just a phone camera
- GPU inference cost needs to stay low for batch processing
Approach
Iterative. Started with classical CV (Hough circles for stones, Hough lines for the grid), hit parameter-sensitivity limits, then moved to zero-shot segmentation with SAM 3. Currently using SAM 3 via Ultralytics, with post-processing to filter stone masks, running on rented GPUs.
Key Decisions
Moved from classical CV to SAM 3
Hough-based detection required per-board parameter tuning. SAM 3 generalises across lighting conditions and board types without domain-specific training.
Vast.ai over Google Colab for GPU inference
Colab sessions disconnect, GPU allocation is unreliable, and batch processing 50+ images needs stable sessions. Vast.ai gives predictable cost and full session control.
Tech Stack
Python, OpenCV, Meta SAM 3, Ultralytics, PyTorch, Vast.ai, Google Colab
Result & Impact
- 92-95% stone detection accuracy
- ~2s per image inference time (GPU)
Handles varied real-world photos with minimal manual intervention. Outputs standard SGF files compatible with any Go software.
Learnings
- Classical CV is a useful prototyping step but breaks down when you need robustness across conditions
- SAM 3 segments everything, not just what you want. Post-processing to filter relevant masks is where most of the work goes
- Renting GPUs on demand is more practical than free tier cloud notebooks for anything beyond experimentation
This project is an ongoing effort to build a camera-based Go board reader. The goal is to point a phone at a board, take a photo, and get back a digital representation of the position.
Current state
The pipeline uses Meta’s SAM 3 (via Ultralytics) for stone segmentation, with post-processing to filter stone masks from background noise based on circularity and size. Grid detection maps segmented stones to board intersections. The output is an SGF file.
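The final mapping step, from filtered stones to SGF, can be sketched as below. The input formats (`(x, y, colour)` stone centres and a 19×19 array of intersection pixel coordinates, e.g. from a fitted homography) are assumptions about the pipeline's intermediates, not its actual interfaces.

```python
import numpy as np

SGF_COORDS = "abcdefghijklmnopqrs"  # SGF uses a-s for 19x19; no 'i' skip

def stones_to_sgf(stone_centres, grid_points, size=19):
    """Snap stone centres to the nearest intersection and emit an SGF.

    `stone_centres`: list of (x, y, colour) with colour in {"B", "W"}.
    `grid_points`: (size, size, 2) array where grid_points[row, col]
    is the pixel coordinate of that intersection.
    """
    flat = grid_points.reshape(-1, 2)
    black, white = [], []
    for x, y, colour in stone_centres:
        idx = int(np.argmin(np.sum((flat - (x, y)) ** 2, axis=1)))
        col, row = idx % size, idx // size
        coord = SGF_COORDS[col] + SGF_COORDS[row]  # SGF: column letter, then row
        (black if colour == "B" else white).append(coord)
    props = f"SZ[{size}]"
    if black:
        props += "AB" + "".join(f"[{c}]" for c in sorted(black))
    if white:
        props += "AW" + "".join(f"[{c}]" for c in sorted(white))
    return f"(;GM[1]FF[4]{props})"
```

Emitting the position as `AB`/`AW` setup properties (rather than a move sequence) matches what a single photo can tell you: the state, not the order of play.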
Inference runs on rented RTX 4090s via Vast.ai at roughly $0.30/hour.
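At those figures, batch cost is negligible. A back-of-envelope check using the numbers quoted in this writeup (~2s/image inference, ~$0.30/hour):

```python
# Batch cost estimate for a 50-image run on a rented RTX 4090.
images = 50
seconds = images * 2              # ~2 s/image of GPU inference
cost = seconds / 3600 * 0.30      # at ~$0.30/hour
print(f"{images} images: ~{seconds}s of GPU time, ~${cost:.3f}")
```

So the practical cost driver is instance spin-up and idle time, not inference itself.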
What’s next
- Fine-tuning on Go-specific training data to improve edge-of-board accuracy
- Temporal tracking across consecutive photos for full game recording
- A simple API that accepts a photo and returns SGF