There is also (or at least used to be?) an OpenAI-compatible API layer for Ollama, so that may be an option as well, though my understanding is that there are some downsides to using it.
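For anyone who wants to try it, here is a minimal sketch against Ollama's OpenAI-compatible endpoint, assuming Ollama is running on its default port (11434) and the model name below is just an example of something you have pulled locally:

```python
from openai import OpenAI

# Point the stock OpenAI client at Ollama's OpenAI-compatible layer.
client = OpenAI(
    base_url="http://localhost:11434/v1",  # /v1 is the compatibility endpoint
    api_key="ollama",  # the client requires a key, but Ollama ignores it
)

response = client.chat.completions.create(
    model="llama3",  # example only; use whatever model you have pulled
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```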
Note: This comment and the link are just meant as references/conveniences, not intended as a request for free labor. Thanks for opening up the code!
Forget Ollama; just changing the URL from OpenAI's to your local server is enough, since llama.cpp has a compatible endpoint. Most people just don't bother exposing the option because you get a CORS error if the local server doesn't have a valid cert.
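To make the URL swap concrete, a sketch assuming llama.cpp's `llama-server` is running on its default port (8080, started e.g. with `llama-server -m model.gguf`); the client code is the same as for any OpenAI backend, only the base URL differs:

```python
from openai import OpenAI

# Same stock client, different base URL: llama-server exposes an
# OpenAI-compatible API under /v1.
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",  # llama.cpp's server ignores the key
)

reply = client.chat.completions.create(
    model="local",  # llama-server serves whichever model it was started with
    messages=[{"role": "user", "content": "Say hi."}],
)
print(reply.choices[0].message.content)
```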