
I'm curious, what's the approach for keeping the various GPU backends maintainable and decoupled?


It was designed in #915 (read just the OP and the linked PRs at the end), and the implementation follows it closely, at least for the Metal backend. The CUDA and OpenCL backends are currently slightly coupled in ggml because their development started before #915, but I think we'll resolve this eventually.

#915 - https://github.com/ggerganov/llama.cpp/discussions/915


interesting decoupling method, ty :)



