Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It's 480B params, not 480GB. The 4 bit version of this is 270GB. I believe it's trained at bf16, so you need over a TB of memory to operate the model at bf16. No one should be trying to replace claude with a quantized 8 bit or 4 bit model. It's simply not possible. Also, this model isn't going to be as versed as Claude at certain libraries and languages. I have something written entirely my claude which uses the Fyne library extensively in golang for UI. Claude knows it inside and out as it's all vibe coded, but the 4 bit Qwen3 coder just hallucinated functions and parameters that don't exist because it wasn't willing to admit it didn't know what it was doing. Definitely don't judge a model by it's quant is all I'm saying.


Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: