
Really short summary: LLaMA is better on every benchmark listed, even the smaller LLaMA models.

Table format: Benchmark, Cerebras 13B, LLaMA 7B, LLaMA 13B, LLaMA 65B

HellaSwag, 51.3, 76.1, 79.2, 84.2

PIQA, 76.6, 79.8, 80.1, 82.8

WinoGrande, 64.6, 70.1, 73.0, 77.0

ARC-e, 71.4, 72.8, 74.8, 78.9

ARC-c, 36.7, 47.6, 52.7, 56.0

OpenBookQA, 28.6, 57.2, 56.4, 60.2
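
For anyone who wants to check the summary against the numbers, here is a minimal sketch that computes how far the smallest LLaMA model is ahead of Cerebras 13B on each benchmark. The scores are copied from the first two model columns of the table above; the names `SCORES` and `margins_over_cerebras` are made up for this illustration and come from neither project.

```python
# Scores copied from the table above: (Cerebras 13B, LLaMA 7B).
# The dict layout and function name are illustrative assumptions.
SCORES = {
    "HellaSwag": (51.3, 76.1),
    "PIQA": (76.6, 79.8),
    "WinoGrande": (64.6, 70.1),
    "ARC-e": (71.4, 72.8),
    "ARC-c": (36.7, 47.6),
    "OpenBookQA": (28.6, 57.2),
}


def margins_over_cerebras(scores):
    """Return {benchmark: LLaMA 7B score minus Cerebras 13B score}."""
    return {
        name: round(llama_7b - cerebras_13b, 1)
        for name, (cerebras_13b, llama_7b) in scores.items()
    }


if __name__ == "__main__":
    margins = margins_over_cerebras(SCORES)
    # Print benchmarks from smallest to largest margin.
    for name, margin in sorted(margins.items(), key=lambda kv: kv[1]):
        print(f"{name}: +{margin} points")
    print("LLaMA 7B ahead everywhere:", all(m > 0 for m in margins.values()))
```

The smallest margin is on ARC-e (1.4 points) and the largest on OpenBookQA (28.6 points), so the 7B model is ahead of the 13B Cerebras model on all six benchmarks.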


