Expose and Secure Your Self-Hosted Ollama API
Ollama is a locally deployed AI model runner that lets you download and run large language models (LLMs) on your own machine. By combining Ollama with ngrok, you can give your local LLM an endpoint on the internet, enabling remote access and integration with other applications.
But putting your entire Ollama API on the public internet could expose your LLM to abuse. Instead, you can use Traffic Policy to add a layer of authentication that restricts access to only yourself or trusted colleagues.
1. Reserve a domain
Navigate to the Domains section of the ngrok dashboard and click New + to reserve a free static domain like https://your-ollama-llm.ngrok.app, or a custom domain you already own. We'll refer to this domain as $NGROK_DOMAIN from here on out.
2. Create a Traffic Policy file
On the system where Ollama runs, create a file named ollama.yaml and paste in the following policy:
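A minimal sketch of what that policy can look like, assuming ngrok's add-headers Traffic Policy action (verify the exact syntax against the current Traffic Policy reference):

```yaml
# ollama.yaml
on_http_request:
  - actions:
      # Rewrite the Host header so Ollama treats the request as local
      - type: add-headers
        config:
          headers:
            host: localhost
```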
What's happening here? This policy rewrites the Host header of every HTTP request to localhost so that Ollama accepts the requests.
3. Start your Ollama endpoint
On the same system where Ollama runs, start the agent on port 11434, which is the default for Ollama, and reference the ollama.yaml file you just created. Be sure to also change $NGROK_DOMAIN to the domain you reserved earlier.
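Assuming an ngrok v3 agent with Traffic Policy support, the command looks like this:

```bash
# Forward traffic from your reserved domain to Ollama's default port,
# applying the Traffic Policy file created in step 2
ngrok http 11434 --url $NGROK_DOMAIN --traffic-policy-file ollama.yaml
```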
4. Try out your Ollama endpoint
You can use curl in your terminal to send a prompt to your LLM through your $NGROK_DOMAIN, replacing $MODEL with the Ollama model you pulled.
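For example, against Ollama's /api/generate endpoint:

```bash
# Send a prompt to the model through your public ngrok endpoint
curl https://$NGROK_DOMAIN/api/generate -d '{
  "model": "$MODEL",
  "prompt": "Why is the sky blue?"
}'
```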
Optional: Protect your Ollama instance with Basic Auth
You may not want everyone to be able to access your LLM. ngrok can quickly add authentication to your LLM without any changes to your Ollama configuration.
Edit your ollama.yaml file and add in the policy below.
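A sketch of the updated ollama.yaml, assuming ngrok's basic-auth action runs in front of the Host rewrite (user:password1 is a placeholder credential you should replace):

```yaml
# ollama.yaml
on_http_request:
  - actions:
      # Reject any request that lacks valid Basic Auth credentials
      - type: basic-auth
        config:
          credentials:
            - user:password1
      # Rewrite the Host header so Ollama treats the request as local
      - type: add-headers
        config:
          headers:
            host: localhost
```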
What's happening here? This policy first checks whether the incoming HTTP request contains the appropriate Authorization: Basic header with a base64-encoded version of one of the username:password pairs you specified in ollama.yaml. Only requests with valid Basic Auth are passed through to your ngrok agent and forwarded to your Ollama API.
Restart your ngrok agent to apply the new policy.
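Stop the agent (Ctrl+C) and start it with the same command as before:

```bash
ngrok http 11434 --url $NGROK_DOMAIN --traffic-policy-file ollama.yaml
```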
You can test your policy by sending the same LLM prompt to Ollama's API with the Authorization: Basic header, once again replacing $NGROK_DOMAIN and $MODEL.
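For example, base64-encoding the placeholder user:password1 pair from the policy above:

```bash
# Attach a valid username:password pair as a Basic Auth header
curl https://$NGROK_DOMAIN/api/generate \
  -H "Authorization: Basic $(echo -n 'user:password1' | base64)" \
  -d '{
    "model": "$MODEL",
    "prompt": "Why is the sky blue?"
  }'
```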
If you send the same request without the Authorization header, you should receive a 401 Unauthorized response.
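For instance, the same request with no credentials is rejected at the ngrok edge before it ever reaches Ollama:

```bash
curl -i https://$NGROK_DOMAIN/api/generate -d '{"model": "$MODEL", "prompt": "Hello"}'
# → HTTP/2 401
```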
Your personal LLM is now locked down and accepts only authenticated requests.
What's next?
- Read more about Traffic Policy, core concepts, and actions you might want to implement next, like IP restrictions instead of Basic Auth.
- Explore other ways to block unwanted requests from Tor users, search or AI bots, and more to protect your self-hosted LLM.
- View your Ollama traffic in Traffic Inspector.