AI Goban/Baduk Board Detector
Computer vision system that reads a physical Go board from a photo and outputs a digital board state.
Overview
A pipeline that takes a photo of a Go board and produces an SGF file with the current position. Started with classical computer vision (OpenCV, Hough transforms), then moved to deep learning with Meta's SAM 3 for segmentation, running on rented GPUs via Vast.ai.
Problem
Recording physical Go games requires manual move-by-move transcription or expensive dedicated hardware. A camera-based system could make any board game digitally analysable.
Constraints
- Must work on varied boards, lighting conditions, and camera angles
- No dedicated hardware, just a phone camera
- GPU inference cost needs to stay low for batch processing
Approach
Iterative. Started with classical CV (Hough circles for stones, Hough lines for the grid), hit parameter-sensitivity limits, then moved to zero-shot segmentation with SAM 3. Currently using SAM 3 via Ultralytics, with post-processing to filter stone masks, running on rented GPUs.
Key Decisions
Moved from classical CV to SAM 3
Hough-based detection required per-board parameter tuning. SAM 3 generalises across lighting conditions and board types without domain-specific training.
Vast.ai over Google Colab for GPU inference
Colab sessions disconnect, GPU allocation is unreliable, and batch processing 50+ images needs stable sessions. Vast.ai gives predictable cost and full session control.
Tech Stack
Python, OpenCV, Meta SAM 3, Ultralytics, PyTorch, Vast.ai, Google Colab
Result & Impact
- 92-95% stone detection accuracy
- ~2s per image inference time (GPU)
Handles varied real-world photos with minimal manual intervention. Outputs standard SGF files compatible with any Go software.
Learnings
- Classical CV is a useful prototyping step but breaks down when you need robustness across conditions
- SAM 3 segments everything, not just what you want. Post-processing to filter relevant masks is where most of the work goes
- Renting GPUs on demand is more practical than free tier cloud notebooks for anything beyond experimentation
This project is an ongoing effort to build a camera-based Go board reader. The goal is to point a phone at a board, take a photo, and get back a digital representation of the position.
Current state
The pipeline uses Meta’s SAM 3 (via Ultralytics) for stone segmentation, with post-processing to filter stone masks from background noise based on circularity and size. Grid detection maps segmented stones to board intersections. The output is an SGF file.
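The final mapping step, from filtered stones to SGF, can be sketched as below. The input formats (`(x, y, colour)` stone centres and a 19×19 array of intersection pixel coordinates, e.g. from a fitted homography) are assumptions about the pipeline's intermediates, not its actual interfaces.

```python
import numpy as np

SGF_COORDS = "abcdefghijklmnopqrs"  # SGF uses a-s for 19x19; no 'i' skip

def stones_to_sgf(stone_centres, grid_points, size=19):
    """Snap stone centres to the nearest intersection and emit an SGF.

    `stone_centres`: list of (x, y, colour) with colour in {"B", "W"}.
    `grid_points`: (size, size, 2) array where grid_points[row, col]
    is the pixel coordinate of that intersection.
    """
    flat = grid_points.reshape(-1, 2)
    black, white = [], []
    for x, y, colour in stone_centres:
        idx = int(np.argmin(np.sum((flat - (x, y)) ** 2, axis=1)))
        col, row = idx % size, idx // size
        coord = SGF_COORDS[col] + SGF_COORDS[row]  # SGF: column letter, then row
        (black if colour == "B" else white).append(coord)
    props = f"SZ[{size}]"
    if black:
        props += "AB" + "".join(f"[{c}]" for c in sorted(black))
    if white:
        props += "AW" + "".join(f"[{c}]" for c in sorted(white))
    return f"(;GM[1]FF[4]{props})"
```

Emitting the position as `AB`/`AW` setup properties (rather than a move sequence) matches what a single photo can tell you: the state, not the order of play.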
Inference runs on rented RTX 4090s via Vast.ai at roughly $0.30/hour.
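At those figures, batch cost is negligible. A back-of-envelope check using the numbers quoted in this writeup (~2s/image inference, ~$0.30/hour):

```python
# Batch cost estimate for a 50-image run on a rented RTX 4090.
images = 50
seconds = images * 2              # ~2 s/image of GPU inference
cost = seconds / 3600 * 0.30      # at ~$0.30/hour
print(f"{images} images: ~{seconds}s of GPU time, ~${cost:.3f}")
```

So the practical cost driver is instance spin-up and idle time, not inference itself.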
What’s next
- Fine-tuning on Go-specific training data to improve edge-of-board accuracy
- Temporal tracking across consecutive photos for full game recording
- A simple API that accepts a photo and returns SGF