Note: Overall leaderboard rankings may not reflect true model quality; individual benchmarks give a clearer picture.

xbench

2 models

Top 10 Models Performance

Rank  Model                 Score
🥇    tencent/youtu-llm-2b  19.5
🥈    qwen/qwen3-4b         18.4