LLMs

Models are specified in settings as a string of the form provider:model-id.

.aether/settings.json
{
  "agents": [
    {
      "model": "zai:glm-5.1",
      ...
    }
  ]
}

| Form | Example | Meaning |
| --- | --- | --- |
| Single model | zai:glm-5.1 | Use one provider/model pair. |
| Alloy | zai:glm-5.1,openai:gpt-5.5 | Round-robin between model specs each conversation turn. |
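
The two forms above can be illustrated with a small parser. This is a minimal sketch, not Aether's actual implementation: `ModelSpec`, `parse_model_spec`, and `parse_model_setting` are hypothetical names, and splitting on the first colon only is an assumption made so that model IDs which themselves contain colons (for example Bedrock IDs ending in `:0`) stay intact.

```python
from typing import NamedTuple


class ModelSpec(NamedTuple):
    provider: str
    model_id: str


def parse_model_spec(spec: str) -> ModelSpec:
    """Split one 'provider:model-id' string into its two parts."""
    # Assumption: only the first colon separates provider from model ID,
    # so IDs that contain further colons are preserved unchanged.
    provider, sep, model_id = spec.strip().partition(":")
    if not sep or not provider or not model_id:
        raise ValueError(f"expected 'provider:model-id', got {spec!r}")
    return ModelSpec(provider, model_id)


def parse_model_setting(value: str) -> list[ModelSpec]:
    """Parse a single spec or a comma-separated alloy into a list of specs."""
    return [parse_model_spec(part) for part in value.split(",")]


if __name__ == "__main__":
    print(parse_model_setting("zai:glm-5.1"))
    print(parse_model_setting("zai:glm-5.1,openai:gpt-5.5"))
```

A single model parses to a one-element list, so callers can treat single models and alloys uniformly.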

Anthropic

Credentials: ANTHROPIC_API_KEY
Model syntax: anthropic:<model-id>
Default provider model: claude-sonnet-4-5-20250929

OpenAI

Credentials: OPENAI_API_KEY
Model syntax: openai:<model-id>

Codex

Credentials: OAuth login stored in the configured credential store; OS keychain by default
Model syntax: codex:<model-id>
Common model: codex:gpt-5.5

Codex uses OpenAI’s Codex OAuth flow through the client. If no Codex credentials are available, Aether asks you to log in when the model is selected.

OpenRouter

Credentials: OPENROUTER_API_KEY
Model syntax: openrouter:<vendor>/<model-id>

Use the vendor/model format from the OpenRouter model list.

OpenAI-compatible providers

Aether has built-in OpenAI-compatible provider configs for:

| Provider | Syntax | Env var |
| --- | --- | --- |
| DeepSeek | deepseek:<model-id> | DEEPSEEK_API_KEY |
| Moonshot | moonshot:<model-id> | MOONSHOT_API_KEY |
| ZAI | zai:<model-id> | ZAI_API_KEY |

Gemini

Credentials: GEMINI_API_KEY
Model syntax: gemini:<model-id>

Bedrock

Credentials: AWS credential chain (environment, config file, IAM role, etc.)
Model syntax: bedrock:<model-id>

Use the Bedrock foundation model or system profile model ID as the Aether model identity. Aether uses this value for context windows, multimodal/reasoning capability checks, and Bedrock prompt caching decisions.

If you route traffic through an inference profile, configure the profile ARN as a provider override instead of putting the ARN in model:

.aether/settings.json
{
  "agents": [
    {
      "name": "BedrockProfile",
      "description": "Routes Bedrock requests through an application inference profile",
      "model": "bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0",
      "providers": {
        "bedrock": {
          "inferenceProfileArn": "arn:aws:bedrock:us-west-2:000000000000:application-inference-profile/000000000000"
        }
      },
      "userInvocable": true
    }
  ]
}

Do not use bedrock:arn:aws:bedrock:... as the model. The ARN only identifies the request target; it does not tell Aether which model capabilities or context window to use.
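
One way to catch this mistake early is to scan the settings before use. The following is a minimal sketch under stated assumptions: `agents_with_arn_model` is a hypothetical helper (Aether is not known to ship such a check), and it only inspects the `agents[].model` and `agents[].name` fields shown in the examples on this page.

```python
import json


def agents_with_arn_model(settings: dict) -> list[str]:
    """Return names of agents whose model spec embeds a Bedrock ARN."""
    flagged = []
    for agent in settings.get("agents", []):
        # A model value may be an alloy, so check each comma-separated spec.
        for spec in agent.get("model", "").split(","):
            if spec.strip().startswith("bedrock:arn:"):
                flagged.append(agent.get("name", "<unnamed>"))
                break
    return flagged


if __name__ == "__main__":
    # Illustrative settings only; the agent names are made up for this sketch.
    settings = json.loads("""
    {
      "agents": [
        {"name": "Good", "model": "bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0"},
        {"name": "Bad", "model": "bedrock:arn:aws:bedrock:us-west-2:000000000000:application-inference-profile/000000000000"}
      ]
    }
    """)
    print(agents_with_arn_model(settings))
```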

Ollama

Credentials: None
Model syntax: ollama:<model-id>
Host override: OLLAMA_HOST

Aether discovers models you have pulled into Ollama.

llama.cpp

Credentials: None
Model syntax: llamacpp:<model-id>
Host override: LLAMA_CPP_HOST

Aether queries the server’s /v1/models endpoint for the currently loaded model.

Reasoning effort

Some providers support extended thinking. Set reasoningEffort on an agent to request a thinking budget:

| Value | Meaning |
| --- | --- |
| "low" | Minimal thinking |
| "medium" | Moderate thinking |
| "high" | Extended thinking |
| "xhigh" | Maximum thinking budget for providers that expose a fourth level |

.aether/settings.json
{
  "agents": [
    {
      "name": "Plan",
      "description": "Plans implementation strategy",
      "model": "codex:gpt-5.5",
      "reasoningEffort": "xhigh",
      "userInvocable": true,
      "prompts": [".aether/PLAN.md"]
    }
  ]
}

Providers that do not support the requested effort ignore it or map it to their closest available setting.

Alloys

An alloy spec is a comma-separated list of model specs. Each conversation turn uses the next model spec in the list, cycling round-robin.

.aether/settings.json
{
  "agents": [
    {
      "name": "Alloy",
      "description": "Alternates between cloud and local models",
      "model": "zai:glm-5.1,ollama:llama3.2",
      "userInvocable": true,
      "prompts": [".aether/ALLOY.md"]
    }
  ]
}

Every model in the alloy must be usable in the current runtime environment: required credentials must be set, and local servers must be running.
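
The round-robin rotation described in this section can be sketched as follows. This is illustrative only: `alloy_turns` is a hypothetical helper, and it assumes rotation starts at the first spec and advances once per conversation turn. How Aether tracks turn state internally is not described here.

```python
from itertools import cycle, islice


def alloy_turns(model_setting: str, turns: int) -> list[str]:
    """Return the model spec that would serve each of the first `turns` turns."""
    specs = [s.strip() for s in model_setting.split(",") if s.strip()]
    if not specs:
        raise ValueError("alloy must contain at least one model spec")
    # cycle() repeats the list endlessly; islice() takes one spec per turn.
    return list(islice(cycle(specs), turns))


if __name__ == "__main__":
    schedule = alloy_turns("zai:glm-5.1,ollama:llama3.2", 4)
    for turn, spec in enumerate(schedule, start=1):
        print(f"turn {turn}: {spec}")
```

With the two-model alloy from the example above, odd turns go to zai:glm-5.1 and even turns to ollama:llama3.2 under these assumptions, which is why both must be reachable for the whole conversation.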