As much as I love their work, I can't be the only one who really struggles to see a path to profitability for Mistral, right? How do you make money selling API access to a model which anyone else can spin up an API for (license is Apache 2.0) on AWS or GCP or similar? Do they have some sort of magic inference optimization that allows them to be cheaper per-token than other hosting providers? Why would I use their API instead of anybody else's?
Asking these questions as a genuine fan of this company—I really want to believe they can succeed and not go the way of StabilityAI.
If VC funding for AI dries up but the French continue investing in Mistral, that would blunt much of the damage of an AI winter that could sink OpenAI and Anthropic.
Even if Google had, it would have been of little value, since running Google, even back then, required a lot of computers and a lot of humans.
which is rather like Mistral - running large models is expensive, and hosting lets you amortise that across lots of users who individually use the model very little.
Google ostensibly started on beige boxes, though. They used whatever computers they could get cheaply and quickly; even older hardware sufficed. There was a niche global group of people who could make stuff like that work as a much larger compute system (Beowulf clusters, etc.). I don't know that it took "a lot of humans" to bootstrap.
That hasn't been true for their largest model since the 2407 release of Mistral Large 2 (https://mistral.ai/news/mistral-large-2407/), it is however under a non-commercial license.
> How do you make money selling API access to a model which anyone else can spin up an API for (license is Apache 2.0) on AWS or GCP
Uhhh.. easily. Don’t host it on AWS or GCP, where everyone is paying a 10x markup for proprietary infrastructure? Don’t hire thousands of unnecessary employees? Don’t bank on outrageous valuations? There are lots of ways to compete with big tech.
I guess I was just under the impression that cloud inference is such a competitive market that it'd be nigh-on-impossible to compete with the major players.
Except Dropbox always kept their users tied to their platform, which allowed them to gradually enshittify their offering, starting with removal of directly addressable content in "Public" folder, and continuing through various changes and side products that all had very little to do with "a folder that syncs". Mistral can't successfully enshittify if they can't keep users captive.
It's somewhat common to open source the core yet still monetize a version (browser vendors, SaaS, games). People will still pay for convenience, reliability, or for the best product.
> Why would I use their API?
To pay 15 cents per million tokens instead of $5.
> license is Apache 2.0
There are browser vendors that use Chromium despite "competing" with Chrome - even though it's the same kind of web browser product, there are some benefits if they allow the other options to exist. The same can be said of open source games and frenemy situations like Uber vs. Lyft in the early days - it doesn't necessarily hurt to have others playing your game, especially if you have a common enemy (Firefox, other games, cabs, respectively).
I mostly run Mistral offline in a terminal (via the ollama CLI), but in a case where I need a text-to-text LLM for an app, and users pay for access to the LLM-powered features, why not use Mistral's API? Then I could have a super cheap app set up on Vercel or whatever and do everything through an API key. The app would "be AI" and yet run on a calculator, for cents.
The main thing that comes to mind regarding "just spin it up on AWS" is the considerable backend needs (GPUs) and the cost to train and run LLMs. In the same way you ask "why use the LLM vendor's cloud option when I could use AWS's cloud option?", you could ask the inverse (or just host it yourself for "free" after the initial setup, if it's cost you're after).
If you need geo-located instances or have other specific requirements, use IaaS, but otherwise I think IaaS like AWS and GCP are a nightmare to manage: the awful IAM experience, all the vendor-specific jargon, navigating the hell that is Amazon.com. For something like an LLM, "just spin it up on AWS" is just funny when you really consider what you're getting yourself into.
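To make the "do everything through an API key" idea concrete, here's a minimal sketch of what a call to Mistral's hosted chat completions endpoint looks like. It only constructs the request rather than sending it; the model name ("mistral-small-latest") and exact request fields are assumptions based on the OpenAI-style API shape, so check the current docs before relying on them.

```python
import json

# Mistral exposes an OpenAI-style chat completions endpoint.
# Endpoint path and body fields are assumptions; verify against current docs.
API_URL = "https://api.mistral.ai/v1/chat/completions"


def build_chat_request(prompt, api_key="YOUR_API_KEY", model="mistral-small-latest"):
    """Build (url, headers, body) for a single-turn chat request.

    Send it with any HTTP client, e.g.:
        requests.post(url, headers=headers, data=body)
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return API_URL, headers, body


url, headers, body = build_chat_request("Summarize this support ticket in one line.")
print(url)
```

The point of the sketch: the app server holds the API key and forwards prompts, so it needs no GPU at all - it really can run "on a calculator for cents".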
My intention was less to imply "I'll just spin it up" and more to imply "Some competitor of Mistral's will spin it up". I agree that from my perspective as a casual user, Mistral's API is quite convenient. What I don't understand is why they aren't driven to zero-margin instantaneously by an onslaught of clones of their business model.
It seems okay at image descriptions, I suppose. It's still a 12B model, though, and doesn't always get OCR anywhere near correct. I tried it on Le Chat, and I'm waiting for it to land in Ollama.
Anyone from Mistral here? The link to the docs is broken, and I really would like to know more about what the specifications are for calling this via API. Foremost, what's the maximum image size you can use via the API? Thank you!