Indeed, Ollama is going down a shady route. https://github.com/ggml-org/llama.cpp/pull/11016#issuecomment-2599740463
I started playing with Ramalama (the name is a mouthful) and it works great. There are one or two more steps in the setup, but I’ve achieved great performance and the project makes good use of standards (OCI, Jinja, unmodified llama.cpp, from what I understand).
Go and check it out; it is compatible with models from HF and Ollama too.
Perhaps give Ramalama a try?
https://github.com/containers/ramalama