Note: Overall leaderboard rankings may not reflect true model quality; individual benchmarks give a clearer picture.

ChessBench

5 models

ChessBench measures an AI's ability to play chess at a grandmaster level without performing explicit search. It is built around a dataset of 10 million chess games from Lichess (15 billion data points), annotated with legal moves and state-values by the Stockfish 16 chess engine, and designed for training large-scale transformers via supervised learning.

Top Models Performance

x-ai/grok-4.1-fast ######################################## 58.7
google/gemini-3.1-pro-preview ###################################### 55.3
openai/gpt-5.5 #################### 29.3
qwen/qwen3.6-plus ################### 28.0
anthropic/claude-opus-4.7 ############# 18.7
Rank  Model                          Score
🥇    x-ai/grok-4.1-fast             58.7
🥈    google/gemini-3.1-pro-preview  55.3
🥉    openai/gpt-5.5                 29.3
4     qwen/qwen3.6-plus              28.0
5     anthropic/claude-opus-4.7      18.7