RiddleBench
6 models
Top Models Performance
| Model | Relative score | Score |
|---|---|---|
| openai/gpt-oss-120b | ######################################## | 69.26 |
| deepseek-ai/deepseek-v3 | ################################## | 58.28 |
| qwen/qwq-32b | ############################# | 50.86 |
| deepseek-ai/deepseek-r1 | ############################# | 50.56 |
| meta-llama/llama-3.3-70b-instruct | ################ | 27.48 |
| google/gemma-3-27b-it | ############## | 25.04 |

| Rank | Model | Score |
|---|---|---|
| 🥇 | openai/gpt-oss-120b | 69.26 |
| 🥈 | deepseek-ai/deepseek-v3 | 58.28 |
| 🥉 | qwen/qwq-32b | 50.86 |
| 4 | deepseek-ai/deepseek-r1 | 50.56 |
| 5 | meta-llama/llama-3.3-70b-instruct | 27.48 |
| 6 | google/gemma-3-27b-it | 25.04 |
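
The bars in the chart appear to be scaled relative to the top score, with the longest bar about 40 characters wide. The page does not state this; it is an inference from the bar lengths. The sketch below shows one way such bars could be produced from the scores in the table. The names `MAX_BAR_WIDTH`, `bar_length`, and `render_chart` are hypothetical and are not part of RiddleBench.

```python
# Scores copied from the leaderboard table above.
SCORES = {
    "openai/gpt-oss-120b": 69.26,
    "deepseek-ai/deepseek-v3": 58.28,
    "qwen/qwq-32b": 50.86,
    "deepseek-ai/deepseek-r1": 50.56,
    "meta-llama/llama-3.3-70b-instruct": 27.48,
    "google/gemma-3-27b-it": 25.04,
}

# Assumption: the longest bar is 40 characters wide.
MAX_BAR_WIDTH = 40


def bar_length(score: float, top_score: float, width: int = MAX_BAR_WIDTH) -> int:
    """Scale a score to a bar length relative to the top score."""
    return round(score / top_score * width)


def render_chart(scores: dict[str, float]) -> list[str]:
    """Return one table row per model, sorted by descending score."""
    top = max(scores.values())
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        f"| {model} | {'#' * bar_length(score, top)} | {score:.2f} |"
        for model, score in ranked
    ]


if __name__ == "__main__":
    for row in render_chart(SCORES):
        print(row)
```

Scaling against the top score rather than against 100 makes the leading model fill the full width. This keeps small gaps visible, but it means bar lengths are only comparable within a single chart.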