Artificial Analysis Coding Index (Maximum Reasoning)
18 models
Top 10 Models Performance
| Model | Relative score | Score |
|---|---|---|
| anthropic/claude-opus-4.8 | ######################################## | 56.7 |
| openai/gpt-5.3-codex | ##################################### | 53.1 |
| openai/gpt-5.2 | ################################## | 48.7 |
| openai/gpt-5.2-codex | ############################## | 43 |
| mistralai/mistral-medium-3.5-128b | ######################### | 35.4 |
| x-ai/grok-4.1-fast | ###################### | 30.9 |
| x-ai/grok-4-fast | ################### | 27.4 |
| x-ai/grok-code-fast-1 | ################# | 23.7 |
| google/gemma-4-26b-a4b-it | ################ | 22.4 |
| openai/o1 | ############## | 20.5 |
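
The bar lengths in the chart above appear to scale linearly with score, with the top score filling 40 characters. That scaling rule is inferred from the rows themselves and is not a documented method, so the following is only a minimal sketch of how such bars could be regenerated. The names `BAR_WIDTH`, `render_bar`, and `render_chart` are hypothetical.

```python
# Sketch: rebuild the text bars for the top 10 models.
# Assumption (inferred from the chart, not documented):
#   bar length = round(score / top_score * 40)

BAR_WIDTH = 40  # assumed width of the longest bar, in characters

# Scores copied from the top 10 rows of the leaderboard.
scores = {
    "anthropic/claude-opus-4.8": 56.7,
    "openai/gpt-5.3-codex": 53.1,
    "openai/gpt-5.2": 48.7,
    "openai/gpt-5.2-codex": 43.0,
    "mistralai/mistral-medium-3.5-128b": 35.4,
    "x-ai/grok-4.1-fast": 30.9,
    "x-ai/grok-4-fast": 27.4,
    "x-ai/grok-code-fast-1": 23.7,
    "google/gemma-4-26b-a4b-it": 22.4,
    "openai/o1": 20.5,
}


def render_bar(score: float, top_score: float, width: int = BAR_WIDTH) -> str:
    """Return a run of '#' whose length is proportional to score / top_score."""
    return "#" * round(score / top_score * width)


def render_chart(scores: dict[str, float]) -> list[str]:
    """Return one pipe-delimited row per model, highest score first."""
    top = max(scores.values())
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # The 'g' format drops a trailing ".0", so 43.0 is printed as 43.
    return [f"| {model} | {render_bar(s, top)} | {s:g} |" for model, s in ranked]


if __name__ == "__main__":
    print("\n".join(render_chart(scores)))
```

Because each bar is normalized to the highest score in the set, the bars show standing relative to the leader rather than absolute index values; adding or removing the top model would rescale every row.
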

| Rank | Model | Score |
|---|---|---|
| 🥇 | anthropic/claude-opus-4.8 | 56.7 |
| 🥈 | openai/gpt-5.3-codex | 53.1 |
| 🥉 | openai/gpt-5.2 | 48.7 |
| 4 | openai/gpt-5.2-codex | 43 |
| 5 | mistralai/mistral-medium-3.5-128b | 35.4 |
| 6 | x-ai/grok-4.1-fast | 30.9 |
| 7 | x-ai/grok-4-fast | 27.4 |
| 8 | x-ai/grok-code-fast-1 | 23.7 |
| 9 | google/gemma-4-26b-a4b-it | 22.4 |
| 10 | openai/o1 | 20.5 |
| 11 | nousresearch/hermes-4-405b | 16 |
| 12 | deepseek-ai/deepseek-r1 | 15.9 |
| 13 | meta-llama/llama-4-maverick-17b-128e-instruct | 15.6 |
| 14 | nousresearch/hermes-4-70b | 14.4 |
| 15 | openai/gpt-3.5-turbo | 10.7 |
| 16 | meta-llama/llama-3.3-70b-instruct | 10.7 |
| 17 | google/gemma-4-e2b-it | 9 |
| 18 | qwen/qwen3-0.6b | 0.9 |