DeepSeek-V2/V3/R1's model architecture is very different from what Fireworks/Together/... were used to.
That's also their "business" model (okay, they don't care much about business for now, but still): you can't run it efficiently without the months of work we've already done, so even with all the weights open you can't compete with us.