
That's the size on disk, my man. When you quantize to a smaller float size you lose precision on the weights, so the model is smaller. Then here they `mmap` the file and it only needs 6 GiB of RAM!


The size mentioned is already quantized (and to integers, not floats). mmap obviously doesn't do any quantization.



