Note: Overall leaderboard rankings may not reflect true model quality; individual benchmarks give a clearer picture.

INCLUDE

9 models

Top 5 Models Performance (bar chart of the five highest INCLUDE scores; the values appear in the ranking table)
Rank  Model                       Score
1     qwen/qwen3.5-27b            82
2     tencent/hy3-preview-base    78.64
3     qwen/qwen3.5-9b             75.6
4     qwen/qwen3-1.7b-base        45.57
5     qwen/qwen3.5-0.8b           40.6
6     qwen/qwen2.5-1.5b           39.55
7     qwen/qwen3-0.6b-base        34.26
8     google/gemma-3-1b-pt        25.62
9     qwen/qwen2.5-0.5b           24.74