https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...
This user has many models already converted for use with GGML (look for "GGML" in the model name):
https://huggingface.co/TheBloke
Or, if you want to convert your own model, the llama.cpp repo has good instructions. Briefly: run `python3 convert.py <model>`, then, if you're using a large-parameter model, you may need to quantize it to fit in memory: `./quantize <source_model> <destination_name> <quantization>`
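As a rough sketch of that workflow, end to end (the model paths and the q4_0 quantization type here are my assumptions; check the llama.cpp README for the options that match your model and memory budget):

```shell
# Clone and build llama.cpp (provides convert.py and the quantize binary)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make

# Convert the original weights to GGML format
# (../models/llama-7b/ is a placeholder path -- point it at your model)
python3 convert.py ../models/llama-7b/

# Quantize the f16 output to 4-bit so it fits in less memory;
# q4_0 is one common choice, other types trade size vs. quality
./quantize ../models/llama-7b/ggml-model-f16.bin \
           ../models/llama-7b/ggml-model-q4_0.bin q4_0
```

After that, the quantized .bin is what you point the llama.cpp `main` binary at.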