Multiple LLM providers
Configure load balancing across multiple LLM providers.
Create a group of LLM providers for the same route. agentgateway automatically load balances requests across the providers in the group using the Power of Two Choices (P2C) algorithm. This algorithm picks two random providers, scores each one based on health, latency, and pending requests, and routes the request to the higher-scoring provider. All providers in a single group are treated as equally preferred — P2C distributes traffic across healthy providers but does not implement failover.
Load balancing vs. failover: The single-group configuration on this page is load balancing, not failover. Failover requires multiple priority groups and a health/eviction policy. When all providers in a priority group are evicted (for example, due to repeated errors or rate limiting), the gateway automatically routes to the next priority group. For a failover example, see the Kubernetes deployment of agentgateway.
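The failover rule described above can be sketched in a few lines. This is only an illustration of the selection logic, not agentgateway's implementation: the function `pick_group`, the provider names, and the representation of evicted providers as a set are all hypothetical.

```python
def pick_group(groups, evicted):
    """Return the usable providers from the highest-priority group.

    groups:  list of provider-name lists, highest priority first (hypothetical shape).
    evicted: set of provider names that are currently evicted.
    """
    for group in groups:
        available = [name for name in group if name not in evicted]
        if available:
            # At least one provider in this priority group is still usable,
            # so lower-priority groups are not consulted.
            return available
    # Every provider in every group is evicted.
    return []


# Hypothetical priority groups: one primary provider, two fallbacks.
groups = [["openai-primary"], ["gemini-fallback", "anthropic-fallback"]]

print(pick_group(groups, evicted=set()))               # ['openai-primary']
print(pick_group(groups, evicted={"openai-primary"}))  # ['gemini-fallback', 'anthropic-fallback']
```

Within the selected group, requests would still be load balanced across the available providers. Failover only decides which group is in use.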
The P2C algorithm can outperform simple round-robin, random, or least-connections strategies because it adapts in real time to each provider’s health, latency, and pending request count.
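To make the selection step concrete, here is a minimal sketch of Power of Two Choices. The `Provider` fields and the `score` formula are assumptions chosen for illustration; the actual scoring used by agentgateway is not specified on this page.

```python
import random
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    healthy: bool = True
    latency_ms: float = 100.0  # recent average latency (assumed metric)
    pending: int = 0           # requests currently in flight


def score(p: Provider) -> float:
    """Hypothetical score, higher is better. Not agentgateway's real formula."""
    if not p.healthy:
        return 0.0
    # Lower latency and fewer pending requests both raise the score.
    return 1.0 / (p.latency_ms * (p.pending + 1))


def pick_p2c(providers, rng=random):
    """Sample two distinct providers at random and return the higher-scoring one."""
    if not providers:
        raise ValueError("no providers configured")
    if len(providers) == 1:
        return providers[0]
    a, b = rng.sample(providers, 2)
    return a if score(a) >= score(b) else b


providers = [
    Provider("openai", latency_ms=120, pending=3),
    Provider("gemini", latency_ms=80, pending=1),
]
# With only two providers, both are always sampled, so the better one wins.
print(pick_p2c(providers).name)  # gemini
```

With more than two providers, each request compares only a random pair. This spreads load without scanning the whole group and still steers traffic away from slow or unhealthy providers.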
Reusable providers in simplified LLM mode
For simplified llm configuration, you can define named provider defaults once in llm.providers[] and reference them from multiple llm.models[] entries with provider.reference. This is different from a provider group: here, the reusable provider acts as a preset, not as a load-balancing pool.
llm:
  providers:
  - name: openai-default
    provider: openAI
    params:
      apiKey: "$OPENAI_API_KEY"
  - name: openai-backup
    provider: openAI
    params:
      apiKey: "$OPENAI_BACKUP_API_KEY"
  models:
  - name: fast
    provider:
      reference: openai-default
    params:
      model: gpt-4o-mini
  - name: smart
    provider:
      reference: openai-backup
    params:
      model: gpt-4o

When a model references a named provider with provider.reference, provider defaults are reused automatically. Keep shared settings on llm.providers[], and only override params.model on the model itself.

llm:
  providers:
  - name: openai-default
    provider: openAI
    params:
      apiKey: "$OPENAI_API_KEY"
  models:
  - name: smart
    provider:
      reference: openai-default
    params:
      model: gpt-4o

In this example, smart inherits the upstream API key from llm.providers[] and only changes the model name.
Named providers can hold shared upstream settings you want to reuse, such as authentication, host overrides, path overrides, or other model defaults. Keep the shared values on llm.providers[] and only set per-model differences on llm.models[].
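One way to picture the inheritance is as a merge of the named provider's params with the model's own params. The sketch below assumes a shallow merge in which per-model values win. The helper `resolve_model` is hypothetical and only mirrors the behavior described above; it is not part of agentgateway.

```python
def resolve_model(model, providers):
    """Combine a model entry with the named provider it references.

    Assumption: provider params act as defaults and per-model params override them.
    """
    by_name = {p["name"]: p for p in providers}
    base = by_name[model["provider"]["reference"]]
    return {
        "name": model["name"],
        "provider": base["provider"],
        # Later keys win, so per-model params override provider defaults.
        "params": {**base.get("params", {}), **model.get("params", {})},
    }


providers = [
    {"name": "openai-default", "provider": "openAI",
     "params": {"apiKey": "$OPENAI_API_KEY"}},
]
smart = {"name": "smart",
         "provider": {"reference": "openai-default"},
         "params": {"model": "gpt-4o"}}

print(resolve_model(smart, providers)["params"])
# {'apiKey': '$OPENAI_API_KEY', 'model': 'gpt-4o'}
```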
Configuration
This example uses the binds/listeners/routes configuration format. For more information, see the Routing-based configuration guide.

Review the following example configuration. The example defines two providers, OpenAI and Gemini, in a single group. Each provider can have its own individual settings, such as host and path overrides, API keys, backend TLS, and more.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
  listeners:
  - routes:
    - backends:
      - ai:
          groups:
          - providers:
            - name: openai
              provider:
                openAI:
                  # Optional; overrides the model in requests
                  model: gpt-3.5-turbo
              backendAuth:
                key: "$OPENAI_API_KEY"
            - name: gemini
              provider:
                gemini:
                  # Optional; overrides the model in requests
                  model: gemini-1.5-flash-latest
              backendAuth:
                key: "$GEMINI_API_KEY"