
With `ollama run gemma3:27b-it-qat "What is blue"`, GPU memory usage is just a hair over 20 GB, so no, probably not without a nerfed context window.


Indeed, the default context length in ollama is a mere 2048 tokens.
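For anyone who wants to see how memory changes with a bigger window, here is a rough sketch of overriding the context length per request through Ollama's local REST API (`/api/generate` with `options.num_ctx`). The helper names `build_request` and `generate` are made up for illustration, and 8192 is an arbitrary example value, not a recommendation. It assumes an Ollama server on the default port with the model already pulled.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="gemma3:27b-it-qat", num_ctx=8192):
    """Build a /api/generate payload that overrides the context length.

    num_ctx=8192 is only an illustrative value (an assumption, not a tuned setting).
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON response instead of a stream
        "options": {"num_ctx": num_ctx},
    }


def generate(prompt, **kwargs):
    """Send the request to a running Ollama server and return the generated text."""
    body = json.dumps(build_request(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Example (needs a running server): generate("What is blue", num_ctx=8192)
```

Expect GPU memory to climb above the ~20 GB figure as `num_ctx` grows, since the KV cache scales with context length.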



