MBPP
18 models
Top 10 Models Performance
| Model | Bar | Score |
|---|---|---|
| qwen/qwen3-4b | ######################################## | 92.3 |
| tencent/youtu-llm-2b | ##################################### | 85 |
| qwen/qwen3-1.7b | ################################### | 80.5 |
| tiiuae/falcon-h1-1.5b-deep-base | ############################### | 70.9 |
| tiiuae/falcon-h1-1.5b-deep-instruct | ############################## | 68.25 |
| yandex/gpt-5-lite-pretrain | ############################## | 68.2 |
| huggingfacetb/smollm3-3b | ############################# | 66.7 |
| google/gemma-3-27b-pt | ############################ | 65.6 |
| tiiuae/falcon-h1-1.5b-base | ############################ | 65.08 |
| tiiuae/falcon-h1-1.5b-instruct | ############################ | 64.81 |

| Rank | Model | Score |
|---|---|---|
| 🥇 | qwen/qwen3-4b | 92.3 |
| 🥈 | tencent/youtu-llm-2b | 85 |
| 🥉 | qwen/qwen3-1.7b | 80.5 |
| 4 | tiiuae/falcon-h1-1.5b-deep-base | 70.9 |
| 5 | tiiuae/falcon-h1-1.5b-deep-instruct | 68.25 |
| 6 | yandex/gpt-5-lite-pretrain | 68.2 |
| 7 | huggingfacetb/smollm3-3b | 66.7 |
| 8 | google/gemma-3-27b-pt | 65.6 |
| 9 | tiiuae/falcon-h1-1.5b-base | 65.08 |
| 10 | tiiuae/falcon-h1-1.5b-instruct | 64.81 |
| 11 | google/gemma-3-12b-pt | 60.4 |
| 12 | qwen/qwen3-1.7b-base | 55.4 |
| 13 | deepseek-ai/deepseek-r1-distill-qwen-1.5b | 51.5 |
| 14 | google/gemma-3-4b-pt | 46 |
| 15 | qwen/qwen2.5-1.5b | 43.6 |
| 16 | qwen/qwen3-0.6b-base | 36.6 |
| 17 | qwen/qwen2.5-0.5b | 29.8 |
| 18 | google/gemma-3-1b-pt | 9.2 |