
No, the OP is mistaken. All of the model weights have to be accessed for the forward pass. What has happened is that using mmap changes where the memory is accounted for (kernel vs. process), so the reported usage was being misinterpreted. There are still 30B parameters, and you still need that count times the size of your floating-point representation to use the model.
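To put rough numbers on "parameters times the size of the representation", here is a minimal sketch. The 30B figure is the one quoted above; the list of precisions and the helper name `weight_bytes` are assumptions for illustration, and the estimate covers the weights only (no activations, KV cache, or runtime overhead).

```python
def weight_bytes(n_params, bytes_per_param):
    """Bytes needed to hold the weights alone at a given precision."""
    return n_params * bytes_per_param

N_PARAMS = 30_000_000_000  # "30B", taken from the comment above

# Illustrative precisions; which ones a given runtime supports is an assumption.
for name, size in [("fp32", 4), ("fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
    gigabytes = weight_bytes(N_PARAMS, size) / 1e9
    print(f"{name}: {gigabytes:.0f} GB")
```

The point of the arithmetic is that mmap does not change any of these totals; it only changes which counter the bytes show up under.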


But do they all need to be accessed at the same time? If not, pages that are not being actively used can be dropped from memory until needed again.
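A small sketch of what this question is getting at, under stated assumptions: a throwaway file stands in for the weights, it is mapped read-only, and it is walked one "layer" at a time. The helper names `make_fake_weights` and `checksum_layer_by_layer` are hypothetical, and this is not how any particular inference runtime is implemented. With a file-backed read-only mapping the OS is allowed to drop pages that are not in use and re-read them from disk later; whether it actually does so depends on memory pressure.

```python
import mmap
import os
import tempfile

def make_fake_weights(n_layers, layer_bytes):
    """Write a stand-in weights file: layer i is filled with byte value i + 1."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        for i in range(n_layers):
            f.write(bytes([i + 1]) * layer_bytes)
        return f.name

def checksum_layer_by_layer(path, layer_bytes):
    """Map the file read-only and touch it one layer-sized slice at a time."""
    total = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for start in range(0, len(mm), layer_bytes):
                # Slicing copies this range out of the mapping; pages are
                # faulted in on first access, not when the mapping is created.
                total += sum(mm[start:start + layer_bytes])
    return total

if __name__ == "__main__":
    path = make_fake_weights(n_layers=4, layer_bytes=mmap.PAGESIZE)
    try:
        print(checksum_layer_by_layer(path, mmap.PAGESIZE))
    finally:
        os.remove(path)
```

Note that a forward pass goes through every layer for each token, so any pages that were dropped have to be read back from disk on the next pass. That works, but it trades RAM for disk reads.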



