Note: Overall leaderboard rankings may not reflect true model quality; individual benchmarks give a clearer picture.

Multi-IF

7 models

Rank  Model                             Score
1     liquid/lfm-2.5-8b-a1b             79.93
2     google/gemma-3-4b-it              66.61
3     liquid/lfm-2-8b-a1b               58.19
4     meta-llama/llama-3.2-3b-instruct  50.91
5     liquid/lfm-2.5-350m               44.92
6     google/gemma-3-1b-it              44.25
7     openbmb/minicpm5-1b               43.54