Note: Overall leaderboard rankings may not reflect true model quality; individual benchmarks give a clearer picture.

MT-Bench

4 models

Top 10 Models Performance (bar chart of MT-Bench score per model; the same values appear in the table below)

Rank  Model                                 Score
1     deepseek-ai/deepseek-v2.5             90.2
2     huggingfacetb/smollm2-135m-instruct   19.8
3     tiiuae/falcon-h1-1.5b-deep-instruct   8.53
4     tiiuae/falcon-h1-1.5b-instruct        8.46