DeepSeek reportedly cost just over $5M in compute to train. StarCoder cost around $1M; there is no public figure for StarCoder2, but it's unlikely to be more than a few million. The idea that you need to spend billions on training is OpenAI trying to build a moat that might not actually exist.
These architectures didn't exist last year. The Chinese are innovating thanks to massive government backing, access to talent, and a clear focus on winning the AI race.
StarCoder is an open-source LLM for code, not text. Builder.ai told investors they were building a virtual assistant called "Natasha", not a code assistant.
I'm just telling you what I read online. Builder.ai wasn't competing with GitHub Copilot, Cody, or CodeWhisperer. Those are code assistants for developers. Builder.ai was building a virtual assistant for customers. It was meant to "talk" to clients, gather requirements, and automate parts of the build process. Very different space.
And like I said before, pre-training a dedicated foundation model is expensive, to say nothing of a full LLM built from scratch.
The point is that one doesn't need billions to create useful models (there is no reason they would have needed new foundation models in the first place), and that "running out of money" is unlikely to have been their main problem: $250M should be more than enough to build an AI website builder.
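For rough scale, a quick back-of-envelope sketch. The GPU-hour rate and the fine-tuning budget below are illustrative assumptions of mine, not real quotes; only the pre-training figure mirrors DeepSeek-V3's publicly reported ~2.8M H800 GPU-hours:

```python
# Back-of-envelope compute costs: pre-training a foundation model
# vs. fine-tuning an existing open one. The GPU-hour rate and the
# fine-tuning hours are illustrative assumptions, not real quotes;
# the pre-training hours mirror DeepSeek-V3's reported ~2.8M GPU-hours.

GPU_HOUR_RATE_USD = 2.0          # assumed rental rate per H800 GPU-hour

pretrain_gpu_hours = 2_800_000   # scale of a frontier pre-training run
finetune_gpu_hours = 10_000      # assumed budget to fine-tune an open model

print(f"Pre-training: ${pretrain_gpu_hours * GPU_HOUR_RATE_USD:>13,.0f}")
print(f"Fine-tuning:  ${finetune_gpu_hours * GPU_HOUR_RATE_USD:>13,.0f}")
# Pre-training: $    5,600,000
# Fine-tuning:  $       20,000
```

Even the pre-training figure is about 2% of a $250M raise, which is the point: compute alone doesn't explain running out of money.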
I get what you are saying, but you're missing the most important piece of the puzzle: the data.
Everyone talks about models and infrastructure, but without the right data, you've got nothing. And that's where the biggest hidden cost is.
According to the company's own website, they were creating the data themselves: labelled datasets, transcripts, customer conversations, documents, and more. That's where the real money goes. Millions and millions.
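To gauge whether "millions and millions" is plausible, here's a quick sketch; every rate and volume below is a hypothetical assumption on my part, not a Builder.ai number:

```python
# Back-of-envelope cost of hand-building a proprietary training corpus.
# All rates and volumes are illustrative assumptions, not reported figures.

LABEL_COST_PER_EXAMPLE_USD = 0.50       # assumed cost to label one example
TRANSCRIPTION_COST_PER_HOUR_USD = 90.0  # assumed cost to transcribe + annotate 1h of calls

labeled_examples = 5_000_000            # assumed labelled-dataset size
conversation_hours = 100_000            # assumed hours of customer conversations

labeling = labeled_examples * LABEL_COST_PER_EXAMPLE_USD
transcripts = conversation_hours * TRANSCRIPTION_COST_PER_HOUR_USD

print(f"Labeling:    ${labeling:>12,.0f}")                # $   2,500,000
print(f"Transcripts: ${transcripts:>12,.0f}")             # $   9,000,000
print(f"Total:       ${labeling + transcripts:>12,.0f}")  # $  11,500,000
```

Even with fairly modest per-unit rates, bespoke data collection lands in the eight figures quickly, so the data budget can easily dwarf the compute budget.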