_ncyj (11 months ago) | on: Show HN: DeepSeek My User Agent
The model is much smaller. Compute in China should be more expensive, considering all the US restrictions.
minimaxir (11 months ago):
No one knows the size of o1. The only hint was a paper that suggested it was 200B parameters.
Meanwhile, DeepSeek R1 is known to be 671B parameters because it is open-source.
sigmoid10 (11 months ago):
R1 is a mixture-of-experts model with only 37B active parameters. So while it's definitely expensive to train, it's rather light on compute during inference. All you really need is lots of memory.
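The distinction in the comment above can be made concrete with a back-of-envelope sketch. Only the 671B/37B parameter counts come from the thread; the byte-per-parameter assumption and the rough 2-FLOPs-per-active-parameter rule are illustrative assumptions, not official figures.

```python
def moe_estimate(total_params_b, active_params_b, bytes_per_param=1):
    """Rough estimate of (weight memory in GB, per-token FLOPs).

    Weight memory scales with *total* parameters, since all experts
    must be resident, while per-token compute scales with *active*
    parameters (roughly 2 FLOPs per active parameter per token).
    """
    memory_gb = total_params_b * 1e9 * bytes_per_param / 1e9
    flops_per_token = 2 * active_params_b * 1e9
    return memory_gb, flops_per_token

# R1-like model: 671B total, 37B active, assuming 8-bit (1-byte) weights.
mem, flops = moe_estimate(671, 37, bytes_per_param=1)
print(f"~{mem:.0f} GB of weights, ~{flops / 1e9:.0f} GFLOPs per token")
```

This is why "lots of memory" is the real requirement: the weights of every expert must be loaded (~671 GB here), but each token only exercises the 37B active parameters' worth of compute.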
portaouflop (11 months ago):
Electricity prices in China are about half of what they are in the US, I expect the rest is similar.