ChessBench
5 models
ChessBench measures an AI's ability to play chess at a grandmaster level without performing explicit search. Its underlying dataset, used to train large-scale transformers via supervised learning, comprises 10 million chess games from Lichess (15 billion data points), annotated with legal moves and state values by Stockfish 16, a world-leading chess engine.
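To make "without explicit search" concrete, here is a minimal sketch of search-free move selection: each legal move is scored once by a value predictor and the highest-scoring move is played, with no game tree expanded. The names `pick_move`, `toy_value_fn`, and `TOY_VALUES` are hypothetical, and the numbers are made up for illustration; a real system would replace the lookup table with a trained model's predictions.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a trained model: maps (position, move) to an
# estimated win probability in [0, 1].
ValueFn = Callable[[str, str], float]


def pick_move(fen: str, legal_moves: List[str], value_fn: ValueFn) -> str:
    """Return the legal move with the highest predicted value.

    No lookahead is performed: each candidate move is scored exactly once.
    """
    if not legal_moves:
        raise ValueError("no legal moves available")
    return max(legal_moves, key=lambda move: value_fn(fen, move))


# Toy lookup table standing in for model predictions (made-up numbers).
TOY_VALUES: Dict[str, float] = {
    "e2e4": 0.54,
    "d2d4": 0.53,
    "g1f3": 0.52,
    "a2a3": 0.47,
}


def toy_value_fn(fen: str, move: str) -> float:
    # The toy predictor ignores the position; a real model would not.
    return TOY_VALUES.get(move, 0.0)


# Standard starting position in FEN notation.
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
best = pick_move(START_FEN, list(TOY_VALUES), toy_value_fn)
print(best)
```

The point of the design is that playing strength comes entirely from the quality of the value predictions, since nothing is recovered by looking ahead.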
Top Models Performance
| Rank | Model | Score |
|---|---|---|
| 🥇 | x-ai/grok-4.1-fast | 58.7 |
| 🥈 | google/gemini-3.1-pro-preview | 55.3 |
| 🥉 | openai/gpt-5.5 | 29.3 |
| 4 | qwen/qwen3.6-plus | 28.0 |
| 5 | anthropic/claude-opus-4.7 | 18.7 |