Note: Overall leaderboard rankings may not reflect true model quality; individual benchmarks give a clearer picture.

Codeforces ELO

7 models

Top 10 Models Performance

Rank  Model                               Score
🥇    google/gemma-4-31b-it               2150
🥈    deepseek-ai/deepseek-v4-flash       1000
🥉    deepseek-ai/deepseek-v4-pro         1000
4     google/gemma-4-e4b-it               940
5     deepseek-ai/deepseek-v3.2-speciale  900
6     qwen/qwen3.5-122b-a10b              851
7     qwen/qwen3.5-35b-a3b                822