
M5 is supposed to support FP4 natively, which would explain the speedup on Q4-quantized models (down from BF16).
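For a rough sense of what "down from BF16" means in storage terms, here is a back-of-the-envelope sketch. The parameter count is a made-up illustrative number, and the calculation ignores per-block scales and other metadata that real 4-bit formats carry, so treat it as an approximation rather than a measurement of any actual model or chip.

```python
# Hypothetical parameter count, chosen only for illustration.
PARAMS = 8_000_000_000

BITS_BF16 = 16  # BF16 stores each weight in 16 bits
BITS_4BIT = 4   # a 4-bit format stores each weight in 4 bits


def weight_bytes(params: int, bits_per_weight: int) -> int:
    """Raw weight storage in bytes, ignoring scales and other metadata."""
    return params * bits_per_weight // 8


bf16_bytes = weight_bytes(PARAMS, BITS_BF16)
q4_bytes = weight_bytes(PARAMS, BITS_4BIT)
ratio = bf16_bytes / q4_bytes

print(f"BF16: {bf16_bytes / 1e9:.1f} GB, 4-bit: {q4_bytes / 1e9:.1f} GB, ratio: {ratio:.1f}x")
```

This only shows that 4-bit weights take a quarter of the space of BF16 weights. Whether native FP4 hardware turns that into the observed speedup is a separate question that this arithmetic does not answer.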

