Hacker News

You can download a model from a site like Hugging Face. Here is a leaderboard of models that can be used for inference:

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...

This user has some models already converted for use with GGML (look for models with that in the name):

https://huggingface.co/TheBloke

Or, if you want to convert your own model, the llama.cpp repo has good instructions. Briefly, it's `python3 convert.py <model>`; then, if you are using a large-parameter model, you may need to quantize it to fit in memory: `./quantize <source_model> <destination_name> <quantization>`.
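Spelled out as a shell session, the two steps above might look like this. This is a sketch, not the repo's exact instructions: the paths, output filenames, and the `q4_0` quantization level are illustrative, assuming a built llama.cpp checkout with the original model weights under `models/7B/`:

```shell
# From the root of a built llama.cpp checkout.

# Step 1: convert the original weights to GGML format (fp16).
# The output filename is illustrative; convert.py prints the actual path it wrote.
python3 convert.py models/7B/

# Step 2: quantize the fp16 file down to 4 bits so it fits in less memory.
# q4_0 is one common quantization type; run ./quantize with no arguments
# to see the full list your build supports.
./quantize models/7B/ggml-model-f16.bin models/7B/ggml-model-q4_0.bin q4_0

# Then run inference against the quantized model:
./main -m models/7B/ggml-model-q4_0.bin -p "The capital of France is"
```

The quantization trade-off is size and speed versus quality: 4-bit files are roughly a quarter the size of fp16, at some cost in output quality.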



Thanks



