I play Go. Not well, but enough to want to record my games without typing moves into an app. The idea: point a phone camera at the board, get a digital board state back.
This is an ongoing project. The approach has changed several times across iterations.
The pipeline
The first version was a classical CV pipeline in Python with OpenCV. No neural networks, no cloud GPUs. Just image processing.
import cv2
import numpy as np
img = cv2.imread("board.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (9, 9), 2)
circles = cv2.HoughCircles(
    gray,
    cv2.HOUGH_GRADIENT,
    dp=1.2,
    minDist=30,
    param1=50,
    param2=30,
    minRadius=10,
    maxRadius=40,
)
HoughCircles finds circular shapes in a grayscale image. The parameters control sensitivity: dp sets the inverse accumulator resolution, minDist enforces a minimum spacing between detected centres, param1 is the upper Canny edge threshold, and param2 is the accumulator threshold for centres — lower it and you get more detections, including more false ones.
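One detail worth handling up front: HoughCircles returns None when nothing is detected, and otherwise an array of shape (1, N, 3) of float (x, y, radius) triples. A minimal sketch of consuming that output safely (the helper name is mine):

```python
import numpy as np

def extract_centres(circles):
    """Turn HoughCircles output into a list of integer (x, y, r) tuples.

    circles: None, or an array of shape (1, N, 3) of float triples,
    which is what cv2.HoughCircles returns.
    """
    if circles is None:
        return []
    # round to integer pixel coordinates before indexing into the image
    return [(int(round(x)), int(round(y)), int(round(r)))
            for x, y, r in circles[0]]
```

The None check matters in practice: a frame with no stones (or bad parameters) would otherwise crash the loop over circles[0].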
Detecting the grid
Before finding stones, I needed the grid. Go boards have a 19x19 grid of lines (9x9 and 13x13 for smaller boards), and Hough line detection picks these up.
edges = cv2.Canny(gray, 50, 150)
lines = cv2.HoughLinesP(
    edges,
    rho=1,
    theta=np.pi / 180,
    threshold=80,
    minLineLength=100,
    maxLineGap=10,
)
From the detected lines, I filtered for horizontal and vertical groups, sorted them, and computed intersection points. Each intersection is a board position.
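The grouping and intersection steps can be sketched roughly like this, assuming segments in the (x1, y1, x2, y2) format HoughLinesP produces; the angle tolerance is one more value that needed tuning:

```python
import numpy as np

def split_lines(lines, angle_tol_deg=10):
    """Split detected segments into near-horizontal and near-vertical groups."""
    horizontal, vertical = [], []
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        # segment angle folded into [0, 180) degrees
        angle = np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180
        if angle < angle_tol_deg or angle > 180 - angle_tol_deg:
            horizontal.append((x1, y1, x2, y2))
        elif abs(angle - 90) < angle_tol_deg:
            vertical.append((x1, y1, x2, y2))
    return horizontal, vertical

def intersection(h, v):
    """Intersect two segments treated as infinite lines; None if parallel."""
    x1, y1, x2, y2 = h
    x3, y3, x4, y4 = v
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-9:
        return None
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / denom
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
```

Crossing every horizontal with every vertical gives the candidate intersection set; sorting each group by position and deduplicating nearby lines handles the duplicates Hough tends to produce.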
Mapping stones to positions
With grid intersections and detected circles, the next step was matching. For each circle centre, find the nearest intersection point. If the distance is below a threshold, that intersection has a stone on it.
for circle in circles[0]:
    cx, cy, r = circle
    min_dist = float("inf")
    nearest = None
    for point in intersections:
        d = np.sqrt((cx - point[0])**2 + (cy - point[1])**2)
        if d < min_dist:
            min_dist = d
            nearest = point
    if min_dist < grid_spacing * 0.4:
        # classify as black or white based on pixel intensity
        roi = gray[int(cy-r):int(cy+r), int(cx-r):int(cx+r)]
        mean_val = np.mean(roi)
        color = "black" if mean_val < 128 else "white"
Colour classification used the mean pixel intensity inside the circle. Dark region means black stone, light means white.
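The nearest-intersection loop above can also be vectorised with NumPy, which starts to matter if you run the pipeline per video frame rather than per photo. A sketch with the same 0.4-spacing threshold (the function name is mine):

```python
import numpy as np

def match_stone(cx, cy, intersections, grid_spacing):
    """Return the grid point nearest a circle centre, or None if too far.

    intersections: sequence of (x, y) grid points, shape (N, 2).
    """
    pts = np.asarray(intersections, dtype=float)
    # distance from the circle centre to every intersection at once
    dists = np.hypot(pts[:, 0] - cx, pts[:, 1] - cy)
    i = int(np.argmin(dists))
    if dists[i] < grid_spacing * 0.4:
        return tuple(pts[i])
    return None
```

Same logic, one pass over a (N, 2) array instead of a Python inner loop per circle.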
Where it fell apart
Controlled conditions (good lighting, overhead angle, clean board) gave around 85-90% accuracy. Real-world photos were worse.
The failure modes were predictable. Perspective distortion warps the grid, so line detection misses intersections near the edges. Shadows create false edges. Reflections on white stones confuse the intensity-based colour classification. Wooden boards have grain patterns that Hough picks up as extra lines.
I added perspective correction using cv2.getPerspectiveTransform with manually marked corners. That helped with the grid detection but added a manual step I wanted to avoid.
# board corners marked by hand: top-left, top-right, bottom-right, bottom-left
pts_src = np.float32([[x1, y1], [x2, y2], [x3, y3], [x4, y4]])
pts_dst = np.float32([[0, 0], [600, 0], [600, 600], [0, 600]])
M = cv2.getPerspectiveTransform(pts_src, pts_dst)
warped = cv2.warpPerspective(img, M, (600, 600))
The biggest problem was parameter tuning. Every new lighting condition or board type needed different values for minDist, param1, param2, and the line detection thresholds. There’s no single set of parameters that works across conditions.
Output format
The pipeline outputs an SGF file (Smart Game Format), the standard for recording Go games.
(;GM[1]SZ[19]
AB[dp][pp][dd][pd]
AW[fc][cf][cn][fq]
)
AB marks black stones, AW marks white. Coordinates are letter pairs, with a as the first column or row, so [dp] is column 4, row 16.
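Serialising a detected board state into that format is only a few lines. A sketch, assuming stones come in as 0-based (col, row) pairs (the function name is mine):

```python
def to_sgf(black, white, size=19):
    """Serialise stone lists into a minimal SGF setup position.

    black, white: iterables of 0-based (col, row) pairs;
    (3, 15) becomes the SGF coordinate "dp".
    """
    def coord(col, row):
        # SGF uses letter pairs: "a" is the first column/row
        return chr(ord("a") + col) + chr(ord("a") + row)

    parts = [f"(;GM[1]SZ[{size}]"]
    if black:
        parts.append("AB" + "".join(f"[{coord(c, r)}]" for c, r in black))
    if white:
        parts.append("AW" + "".join(f"[{coord(c, r)}]" for c, r in white))
    return "\n".join(parts) + "\n)"
```

This produces a setup position only (AB/AW properties), which is all a single photo can give you; reconstructing a move sequence would need one frame per move.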
What I took from this
Classical CV is a reasonable starting point for this problem. The concepts are straightforward and the code runs on any machine without a GPU. But the parameter sensitivity makes it impractical for a general-purpose tool. Every board, every lighting setup, every camera angle needs its own calibration.
The next step was to use a model that can learn what stones look like instead of hand-tuning thresholds.