LLMs
Aether supports many LLM providers out of the box, and you can add your own by implementing a trait.
You can specify the model an agent uses in your agent settings file.
{ "name": "my-agent", "model": "anthropic:claude-sonnet-4-5"}In the TUI, only LLM providers with configured credentials appear in the model selector.
Providers
Anthropic
Section titled “Anthropic”| Credentials | ANTHROPIC_API_KEY |
| Model syntax | anthropic:<model-id> |
OpenRouter
| Credentials | OPENROUTER_API_KEY |
| Model syntax | openrouter:<vendor>/<model-id> |
OpenRouter proxies 100+ models from various vendors. Use the vendor/model format from their model list.
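For example, an agent config routed through OpenRouter might look like this (the agent name and model id are illustrative; pick an id from their model list):

```json
{
  "name": "routed-agent",
  "model": "openrouter:deepseek/deepseek-chat"
}
```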
OpenAI
| Credentials | OPENAI_API_KEY |
| Model syntax | openai:<model-id> |
Codex
| Credentials | OAuth login — no API key needed |
| Model syntax | codex:<model-id> |
Codex authenticates through a browser-based OAuth flow with OpenAI. On first use, Aether opens a browser window to complete the login. Credentials are stored securely in your OS keychain — you only need to log in once.
In the model selector, Codex models show a “Needs login” badge until you’ve authenticated.
Gemini
| Credentials | GEMINI_API_KEY |
| Model syntax | gemini:<model-id> |
DeepSeek
| Credentials | DEEPSEEK_API_KEY |
| Model syntax | deepseek:<model-id> |
AWS Bedrock
| Credentials | AWS credential chain (environment, config file, or IAM role) |
| Model syntax | bedrock:<model-id> |
Ollama
| Credentials | None — requires a running Ollama server |
| Model syntax | ollama:<model-id> |
Aether auto-discovers models from your Ollama instance. Any model you’ve pulled with ollama pull will appear in the model selector.
Set OLLAMA_HOST to override the default address (http://localhost:11434).
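A minimal sketch of an agent pinned to a local model (the agent name and model name are illustrative; any model you’ve pulled will work):

```json
{
  "name": "local-agent",
  "model": "ollama:llama3.1"
}
```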
Llama.cpp
| Credentials | None — requires a running llama.cpp server |
| Model syntax | llamacpp:<model-id> |
Llama.cpp serves a single model at a time. Aether queries the server’s /v1/models endpoint to discover the loaded model, which then appears in the model selector.
Set LLAMA_CPP_HOST to override the default address (http://localhost:8080).
Moonshot (Kimi)
| Credentials | MOONSHOT_API_KEY |
| Model syntax | moonshot:<model-id> |
ZAI (Zhipu)
| Credentials | ZAI_API_KEY |
| Model syntax | zai:<model-id> |
Bring your own
Aether’s llm crate exposes the StreamingModelProvider trait, so you can integrate any LLM backend:
```rust
pub trait StreamingModelProvider: Send + Sync {
    fn stream_response(&self, context: &Context) -> LlmResponseStream;
    fn display_name(&self) -> String;
    fn context_window(&self) -> Option<u32>;
    fn model(&self) -> Option<LlmModel> {
        None
    }
}
```
Implement this trait on your own struct, and it can be passed directly to the agent builder. See the Custom Providers guide for a full walkthrough with examples.
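As a rough sketch of what an implementation can look like (the crate path and the details of building an LlmResponseStream are assumptions here; the Custom Providers guide has the real walkthrough):

```rust
use llm::{Context, LlmResponseStream, StreamingModelProvider};

/// Hypothetical provider that forwards requests to an in-house gateway.
struct GatewayProvider {
    endpoint: String,
}

impl StreamingModelProvider for GatewayProvider {
    fn stream_response(&self, _context: &Context) -> LlmResponseStream {
        // Send the conversation context to `self.endpoint` and adapt the
        // backend's streamed chunks into an LlmResponseStream (details elided).
        todo!("call the backend and adapt its stream")
    }

    fn display_name(&self) -> String {
        format!("gateway ({})", self.endpoint)
    }

    fn context_window(&self) -> Option<u32> {
        Some(128_000) // assumed limit for this illustrative backend
    }

    // `model()` keeps its default implementation and returns None.
}
```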
Reasoning effort
Some models support extended thinking. Set reasoningEffort on an agent to control the thinking budget:
| Level | Description |
|---|---|
| "low" | Minimal thinking — fastest responses |
| "medium" | Moderate thinking |
| "high" | Extended thinking for complex tasks |
| "xhigh" | Maximum thinking budget |
{ "name": "deep-thinker", "model": "anthropic:claude-sonnet-4-5", "reasoningEffort": "high", "..."}Alloying
Model alloying lets you combine multiple LLM providers into a single agent by comma-separating model specs. Each turn uses the next model in the list, cycling through in round-robin order.
Syntax
```json
"model": "provider1:model1,provider2:model2,provider3:model3"
```
Example
An agent that alternates between DeepSeek and Anthropic:
{ "name": "alloy-coder", "description": "Cost-optimized coding agent", "model": "deepseek:deepseek-chat,anthropic:claude-sonnet-4-5", "userInvocable": true}Turn 1 uses DeepSeek, turn 2 uses Anthropic, turn 3 uses DeepSeek, and so on.
Use cases
- Cost optimization — Alternate between expensive and cheaper models. Use a powerful model for complex turns and a lighter one for simple follow-ups.
- Redundancy — If one provider has an outage, the other turns still work.
- Comparison — See how different models handle the same conversation context.
Considerations
- Each model in the alloy must have its corresponding API key set
- Reasoning effort applies to all models in the alloy (models that don’t support it will ignore it)
- The conversation context is shared across all models — each model sees the full history regardless of which model generated previous turns