Hacker News

I disagree. Ollama’s raison d’être is to make things simple, not to always be on the cutting edge. I use Ollama when I can because of this simplicity. Since I bought a 32 GB unified-memory Mac 18 months ago, I have run many models with Ollama, with close to zero problems.

The simple thing is to just use the custom quantization (MXFP4) that OpenAI shipped for gpt-oss, and use GGUF for other models.
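To illustrate, the workflow I mean looks roughly like this (a sketch; the model tags are the ones Ollama publishes, but check `ollama list`/the model library for what is actually available to you):

```shell
#!/bin/sh
# Sketch: gpt-oss comes down in OpenAI's own quantization; other models
# (e.g. a Llama variant) come down as GGUF. Either way, the same two
# commands -- that's the simplicity being argued for.
if command -v ollama >/dev/null 2>&1; then
  ollama pull gpt-oss:20b          # fetched in its native quantization
  ollama pull llama3.1:8b          # fetched as a GGUF-packaged model
  ollama run gpt-oss:20b "Say hello in one sentence."
else
  echo "ollama not installed; see https://ollama.com"
fi
```

No conversion step, no picking a quant file by hand; Ollama chooses the packaging for you.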

Using Hugging Face, LM Studio, etc. is the Linux approach: maximum flexibility. Using Ollama is more like using macOS: less control, but it just works.




