> Building on Mistral Small 3, this new model comes with improved text performance, multimodal understanding, and an expanded context window of up to 128k tokens. The model outperforms comparable models like Gemma 3 and GPT-4o Mini, while delivering inference speeds of 150 tokens per second.
This is a really nice bump on the previous model, considering it’s now multimodal. I’m a little surprised it only received a 0.1 version bump.
If you're using Ollama, the 24B Mistral Small there is version 3 rather than 3.1, so it lacks the 128k context window and multimodality. Still a very capable model, though.