Ollama
Verified: Code examples on this page have been automatically tested and verified.
Configure agentgateway to route LLM traffic to Ollama for local model inference.
Configure Ollama to serve local models through agentgateway. Agentgateway 1.3 includes the first-class ollama provider and automatically uses http://localhost:11434/v1 unless you override it.
Before you begin
- Install the agentgateway binary.
- Install Ollama.
- Make sure that you have at least one model pulled locally.
  ollama list
  If not, pull a model.
  ollama pull llama3.2
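The model check above can also be scripted. The following is a minimal Python sketch, assuming Ollama's REST API is reachable at its default address and that `GET /api/tags` lists the locally pulled models. The helper names (`model_names`, `has_model`, `fetch_tags`) are illustrative and are not part of Ollama or agentgateway.

```python
import json
import urllib.request

# Assumption: Ollama listens on its default local address.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_payload: dict) -> list:
    """Extract model names from a parsed /api/tags response."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def has_model(tags_payload: dict, wanted: str) -> bool:
    """Return True if `wanted` is among the pulled models.

    Ollama reports names with a tag (for example "llama3.2:latest"),
    so a bare name is treated as ":latest".
    """
    if ":" not in wanted:
        wanted = wanted + ":latest"
    return wanted in model_names(tags_payload)


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> dict:
    """Fetch and parse the list of local models from a running Ollama."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With Ollama running, `has_model(fetch_tags(), "llama3.2")` reports whether the model used in this guide has been pulled.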
Configure agentgateway
Create a configuration file that routes requests to your local Ollama instance.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: ollama
    params:
      model: llama3.2
Review the following table to understand this configuration.
| Setting | Description |
|---|---|
| provider: ollama | Uses the built-in Ollama provider shortcut instead of the older openAI compatibility path. |
| params.model | Sets the default Ollama model. The model must already exist in your local Ollama instance. |
| params.baseUrl | Optional override for non-default Ollama endpoints. If omitted, agentgateway uses http://localhost:11434/v1. |
| name: "*" | Matches any requested model name, so clients can request any model that Ollama has pulled. |
If Ollama is running somewhere other than http://localhost:11434/v1, override the base URL instead of using host overrides.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
llm:
  port: 3000
  models:
  - name: "*"
    provider: ollama
    params:
      model: llama3.2
      baseUrl: http://192.168.1.20:11434/v1
Start agentgateway:
agentgateway -f config.yaml
Test the configuration
Send a request to verify that agentgateway routes to Ollama. The model name in the request must match a model you already pulled with ollama pull.
curl http://localhost:3000/v1/chat/completions \
  -H "content-type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [
      {
        "role": "user",
        "content": "Explain Ollama in one sentence."
      }
    ]
  }' | jq
Troubleshooting
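When a script is more convenient than curl for reproducing a problem, the test request can be sent with Python's standard library. This is a sketch rather than an official client: it assumes agentgateway listens on port 3000 as in the configuration on this page, and that the response follows the OpenAI-style chat completion shape (`choices[0].message.content`). The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: agentgateway is running locally with `port: 3000`.
GATEWAY_URL = "http://localhost:3000/v1/chat/completions"


def build_chat_request(model: str, prompt: str, url: str = GATEWAY_URL) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_reply(response_payload: dict) -> str:
    """Pull the assistant text out of an OpenAI-style response (assumed shape)."""
    return response_payload["choices"][0]["message"]["content"]


def ask(model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the request through agentgateway and return the reply text.

    urllib raises HTTPError for 4xx/5xx responses and URLError when the
    connection is refused, which maps onto the cases described below.
    """
    with urllib.request.urlopen(build_chat_request(model, prompt), timeout=timeout) as resp:
        return first_reply(json.load(resp))
```

Calling `ask("llama3.2", "Explain Ollama in one sentence.")` mirrors the curl command; the model name must still match one that was pulled with ollama pull.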
Connection refused
What’s happening:
Requests to agentgateway return a 503 response or a connection refused error.
Why it’s happening:
Ollama is not running, is listening on a different address, or params.baseUrl points to the wrong endpoint.
How to fix it:
Verify Ollama is reachable directly.
curl http://localhost:11434/api/version
If Ollama is not running, start it.
ollama serve
If you set params.baseUrl, make sure it includes Ollama’s /v1 prefix.
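The base URL check above can be automated before starting agentgateway. The sketch below is illustrative only: `check_base_url` and `version_url` are hypothetical helpers, and it assumes that Ollama's native API (including `/api/version`) lives at the same address as the base URL with the trailing `/v1` removed.

```python
from urllib.parse import urlsplit, urlunsplit


def check_base_url(base_url: str) -> list:
    """Return a list of problems with a params.baseUrl value (empty if none found)."""
    problems = []
    parts = urlsplit(base_url)
    if parts.scheme not in ("http", "https"):
        problems.append("scheme must be http or https")
    if not parts.hostname:
        problems.append("host is missing")
    # Accept an optional trailing slash, but require the /v1 prefix.
    if not parts.path.rstrip("/").endswith("/v1"):
        problems.append("path should end with /v1")
    return problems


def version_url(base_url: str) -> str:
    """Derive the /api/version URL for a reachability check (assumed layout)."""
    parts = urlsplit(base_url)
    path = parts.path.rstrip("/")
    if path.endswith("/v1"):
        path = path[: -len("/v1")]
    return urlunsplit((parts.scheme, parts.netloc, path + "/api/version", "", ""))
```

For the override used on this page, `check_base_url("http://192.168.1.20:11434/v1")` finds no problems, and `version_url` yields the address to pass to curl when verifying that the remote Ollama is reachable.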
Model not found
What’s happening:
The response returns a model not found error.
Why it’s happening:
The requested model has not been pulled into your local Ollama instance.
How to fix it:
List available models.
ollama list
Pull the missing model.
ollama pull llama3.2
If your Ollama endpoint is behind HTTPS or requires authentication, configure llm.models[].tls and llm.models[].auth as you would for any other upstream provider.