Well-Calibrated Result Probabilities and Human Move Prediction for Elite Classical Chess
Authors/Creators
Description
I describe two machine-learning models for elite classical chess: a result prediction model that outputs well-calibrated win/draw/loss probabilities for the current position, and a move prediction model that ranks legal candidate moves by how likely a strong human would be to play each one. Both are gradient-boosted decision-tree ensembles (LightGBM) built on features from Stockfish evaluations, an upstream human-imitation policy network, and a range of position- and game-level signals. The models are trained on ~266k classical chess games from The Week in Chess in which both players are rated 2400 Elo or higher, and evaluated on a 47k-game set held out from the training pool. On the result task, the production model (which does not see the rating gap between the two players) reaches an expected calibration error of 0.002 on 4.17M held-out positions. On the move task, the production model reaches 61.1% top-1 / 87.8% top-3 accuracy, versus 55.5% / 81.1% for an engine-best-move baseline evaluated on the same positions. Both models are deployed on chessds.com in two latency tiers (around 200 ms and 1 s per position on a single CPU core). I report scaling behavior, sliced metrics for both tasks, and a short ladder of toy baselines that situate the headline numbers against simpler alternatives.
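For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how expected calibration error and top-k accuracy are commonly computed. It is not taken from the paper: the top-label definition of ECE, the ten equal-width bins, and the function names `expected_calibration_error` and `top_k_accuracy` are all assumptions made for illustration, and the paper's actual binning scheme may differ.

```python
from typing import List, Sequence


def expected_calibration_error(
    probs: Sequence[Sequence[float]], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Top-label ECE with equal-width confidence bins (an assumed definition)."""
    conf_sum: List[float] = [0.0] * n_bins   # summed confidence per bin
    hit_sum: List[float] = [0.0] * n_bins    # number of correct predictions per bin
    count: List[int] = [0] * n_bins
    for p, y in zip(probs, labels):
        pred = max(range(len(p)), key=p.__getitem__)  # most probable class (e.g. W/D/L)
        conf = p[pred]
        b = min(int(conf * n_bins), n_bins - 1)       # confidence 1.0 goes in the last bin
        conf_sum[b] += conf
        hit_sum[b] += 1.0 if pred == y else 0.0
        count[b] += 1
    # sum_b (n_b / N) * |acc_b - conf_b| simplifies to sum_b |hits_b - conf_b| / N
    gap = sum(abs(hit_sum[b] - conf_sum[b]) for b in range(n_bins) if count[b])
    return gap / len(labels)


def top_k_accuracy(
    scores: Sequence[Sequence[float]], played: Sequence[int], k: int
) -> float:
    """Fraction of positions where the move actually played is among the k highest-scored."""
    hits = 0
    for s, y in zip(scores, played):
        ranked = sorted(range(len(s)), key=lambda i: s[i], reverse=True)
        hits += y in ranked[:k]
    return hits / len(played)


if __name__ == "__main__":
    # Toy data only: three-class outcome probabilities and per-move scores.
    wdl = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.1, 0.2, 0.7]]
    print(expected_calibration_error(wdl, [0, 1, 2]))
    print(top_k_accuracy([[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]], [2, 0], k=1))
```

Under this definition, a low ECE means that predicted confidence and observed frequency agree closely within each confidence bin, averaged with weights proportional to bin size.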
Files
| Name | Size |
|---|---|
| hamood_2026_chessds.pdf (md5:da4cc55771375855fcb611541d957930) | 711.3 kB |