The model seems to be based on Qwen2.5-Coder-7B. I currently run a quantized variant of Qwen2.5-Coder-7B locally with llama.cpp, and it fits nicely in the 8 GB of VRAM on my Radeon 7600 (with excellent performance, BTW), so running Zeta locally should be perfectly possible.

I would also only use Zeta locally.
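
For reference, here is a minimal sketch of that kind of local setup using the llama-cpp-python bindings instead of the llama.cpp CLI. The GGUF filename and the Q4_K_M quantization are assumptions on my part (any ~4-5 GB quant should leave headroom in 8 GB of VRAM), and it assumes a build with GPU support (e.g. Vulkan or ROCm for AMD cards):

  # Minimal sketch, assuming llama-cpp-python built with GPU support.
  from llama_cpp import Llama

  llm = Llama(
      model_path="qwen2.5-coder-7b-instruct-q4_k_m.gguf",  # hypothetical local path
      n_gpu_layers=-1,  # offload every layer to the GPU
      n_ctx=4096,       # context window; tune to your VRAM budget
  )

  out = llm.create_completion(
      prompt="// C function that reverses a string in place\n",
      max_tokens=256,
  )
  print(out["choices"][0]["text"])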



Are you happy with the speed on your 8 GB GPU?



