Artificial Analysis Agentic Index (Maximum Reasoning)

31 models

Rank	Model	Score
🥇	anthropic/claude-opus-4.8	77.8
🥈	openai/gpt-5.5	74.1
🥉	anthropic/claude-opus-4.7	71.3
4	google/gemini-3.5-flash	70.3
5	openai/gpt-5.4	68.0
6	xiaomi/mimo-v2.5-pro	67.4
7	deepseek-ai/deepseek-v4-pro	67.2
8	zai-org/glm-5.1	67.1
9	qwen/qwen3.7-max	66.6
10	moonshotai/kimi-k2.6	66.0
11	x-ai/grok-4.3	65.9
12	qwen/qwen3.6-max	64.8
13	anthropic/claude-sonnet-4.6	63.0
14	meta/muse-spark	62.0
15	minimaxai/minimax-m2.7	61.5
16	openai/gpt-5.3-codex	60.5
17	openai/gpt-5.2	60.2
18	openai/gpt-5.2-codex	56.5
19	mistralai/mistral-medium-3.5-128b	53.2
20	x-ai/grok-4.1-fast	49.3
21	x-ai/grok-4-fast	39.5
22	x-ai/grok-code-fast-1	35.6
23	google/gemma-4-26b-a4b-it	32.1
24	openai/o1	31.1
25	nousresearch/hermes-4-405b	12.6
26	nousresearch/hermes-4-70b	11.7
27	meta-llama/llama-3.3-70b-instruct	9.1
28	meta-llama/llama-4-maverick-17b-128e-instruct	7.2
29	qwen/qwen3-0.6b	7.0
30	google/gemma-4-e2b-it	6.9
31	deepseek-ai/deepseek-r1	3.8