Note: Overall leaderboard rankings may not reflect true model quality; individual benchmarks give a clearer picture.

xbench

2 models

Top 10 Models Performance

tencent/youtu-llm-2b	########################################	19.5
qwen/qwen3-4b	######################################	18.4

Rank	Model	Score
🥇	tencent/youtu-llm-2b	19.5
🥈	qwen/qwen3-4b	18.4
