ChessBench

5 models

ChessBench measures an AI's ability to play chess at a grandmaster level without performing explicit search. Its setup is based on training large-scale transformers via supervised learning on a dataset of 10 million chess games from Lichess (15 billion data points), annotated with legal moves and state-values provided by Stockfish 16, a world-leading chess engine.
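To give a sense of scale, the two dataset figures quoted above imply an average number of data points per game. The short sketch below only divides one by the other; the variable names are illustrative, and the result is a plain average that says nothing about how data points are distributed across individual games.

```python
# Figures quoted in the benchmark description.
games = 10_000_000             # annotated Lichess games
data_points = 15_000_000_000   # total data points

# Average number of data points contributed by a single game.
points_per_game = data_points / games
print(f"{points_per_game:.0f} data points per game on average")
```

This works out to 1,500 data points per game on average.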

Top 5 Models Performance

x-ai/grok-4.1-fast	########################################	58.7
google/gemini-3.1-pro-preview	######################################	55.3
openai/gpt-5.5	####################	29.3
qwen/qwen3.6-plus	###################	28
anthropic/claude-opus-4.7	#############	18.7
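The bar lengths above appear to be proportional to each score, with the top score drawn at 40 characters. The sketch below reproduces them under that assumption. The 40-character width and the round-to-nearest rule are inferred from the chart, not documented by the benchmark, and `MAX_BAR` and `bar` are hypothetical names.

```python
# Assumed width of the longest bar, read off the chart (not documented).
MAX_BAR = 40

# Scores as listed on the leaderboard.
scores = {
    "x-ai/grok-4.1-fast": 58.7,
    "google/gemini-3.1-pro-preview": 55.3,
    "openai/gpt-5.5": 29.3,
    "qwen/qwen3.6-plus": 28,
    "anthropic/claude-opus-4.7": 18.7,
}

def bar(score: float, top: float) -> str:
    """Scale a score against the top score and round to whole characters."""
    return "#" * round(score / top * MAX_BAR)

top_score = max(scores.values())
bars = {model: bar(score, top_score) for model, score in scores.items()}

for model, score in scores.items():
    print(f"{model}\t{bars[model]}\t{score}")
```

Under these assumptions the computed lengths are 40, 38, 20, 19, and 13 characters, matching the chart.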

Rank	Model	Score
🥇	x-ai/grok-4.1-fast	58.7
🥈	google/gemini-3.1-pro-preview	55.3
🥉	openai/gpt-5.5	29.3
4	qwen/qwen3.6-plus	28
5	anthropic/claude-opus-4.7	18.7