The model is much smaller. Compute in China should be more expensive, considering all the US restrictions.


No one knows the size of o1. The only hint was a paper that suggested it was 200B parameters.

Meanwhile, DeepSeek R1 is known to be 671B parameters because its weights are openly released.


R1 is a mixture-of-experts (MoE) model with only 37B active params per token. So while it's definitely expensive to train, it's rather light on compute during inference. All you really need is lots of memory to hold the full 671B parameters.
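
Roughly, top-k expert routing works like this (a minimal PyTorch sketch with toy dimensions; the class name, sizes, and routing details are illustrative assumptions, not DeepSeek's actual implementation):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyMoELayer(nn.Module):
        """Toy MoE layer: all experts live in memory, only top_k run per token."""
        def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
            super().__init__()
            self.top_k = top_k
            self.router = nn.Linear(d_model, n_experts)  # scores experts per token
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                              nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            )

        def forward(self, x):  # x: (tokens, d_model)
            scores = self.router(x)                            # (tokens, n_experts)
            weights, idx = torch.topk(scores, self.top_k, dim=-1)
            weights = F.softmax(weights, dim=-1)
            out = torch.zeros_like(x)
            # Each token only passes through its top_k chosen experts,
            # so per-token FLOPs scale with active params, not total params.
            for slot in range(self.top_k):
                for e in range(len(self.experts)):
                    mask = idx[:, slot] == e
                    if mask.any():
                        out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
            return out

    layer = TinyMoELayer()
    y = layer(torch.randn(10, 64))  # 8 experts in memory, each token touches 2

All expert weights must stay resident because the router can pick any of them, which is why total parameters (671B) set the memory bill while active parameters (37B) set the per-token compute bill.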


Electricity prices in China are about half of what they are in the US; I expect the rest of the cost structure is similar.



