Note: Overall leaderboard rankings may not reflect true model quality; individual benchmarks give a clearer picture.

arnir0/tiny-llm

7 benchmarks
WinoGrande: 49.7283
TruthfulQA: 27.295
HellaSwag: 27
ARC-Easy: 24.5614
MMLU: 22.739
ARC-Challenge: 19.398
WikiText-2 (-ppl): -162.4178