LLMs

Models are specified in settings as a string in the form provider:model-id.

Examples

{
  "agents": [
    {
      "model": "zai:glm-5.1",
      ...
    }
  ]
}

Form	Example	Meaning
Single model	`zai:glm-5.1`	Use one provider/model pair.
Alloy	`zai:glm-5.1,openai:gpt-5.5`	Round-robin between model specs each conversation turn.
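
To make the two forms concrete, the sketch below splits a spec string into provider/model pairs. It is an illustration, not Aether's parser: `ModelSpec` and `parse_model_spec` are hypothetical names, and splitting on the first colon only is an assumption, made so that model IDs that themselves contain colons (such as Bedrock IDs ending in `:0`) stay intact.

```python
from typing import NamedTuple


class ModelSpec(NamedTuple):
    """One provider/model pair (hypothetical helper type)."""
    provider: str
    model_id: str


def parse_model_spec(spec: str) -> list[ModelSpec]:
    """Split a spec string into provider/model pairs.

    A single model yields one pair; an alloy (comma-separated) yields several.
    """
    pairs = []
    for part in spec.split(","):
        part = part.strip()
        # Split on the first colon only: the model ID may contain colons too.
        provider, sep, model_id = part.partition(":")
        if not sep or not provider or not model_id:
            raise ValueError(f"expected provider:model-id, got {part!r}")
        pairs.append(ModelSpec(provider, model_id))
    return pairs


print(parse_model_spec("zai:glm-5.1"))
print(parse_model_spec("zai:glm-5.1,openai:gpt-5.5"))
```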

Supported Providers

Anthropic


Credentials	`ANTHROPIC_API_KEY`
Model syntax	`anthropic:<model-id>`
Default provider model	`claude-sonnet-4-5-20250929`

OpenAI


Credentials	`OPENAI_API_KEY`
Model syntax	`openai:<model-id>`

Codex


Credentials	OAuth login stored in the configured credential store; OS keyring by default
Model syntax	`codex:<model-id>`
Common model	`codex:gpt-5.5`

Codex uses OpenAI’s Codex OAuth flow through the client. If no Codex credentials are available, Aether asks you to log in when the model is selected.

OpenRouter


Credentials	`OPENROUTER_API_KEY`
Model syntax	`openrouter:<vendor>/<model-id>`

Use the vendor/model format from the OpenRouter model list.

OpenAI-compatible providers

Aether has built-in OpenAI-compatible provider configs for:

Provider	Syntax	Env var
DeepSeek	`deepseek:<model-id>`	`DEEPSEEK_API_KEY`
Fireworks AI	`fireworks:<model-id>`	`FIREWORKS_API_KEY`
Microsoft Foundry	`azure-foundry:<model-id>`	`AZURE_OPENAI_API_KEY`
Moonshot	`moonshot:<model-id>`	`MOONSHOT_API_KEY`
ZAI	`zai:<model-id>`	`ZAI_API_KEY`

Microsoft Foundry

Microsoft Foundry uses the OpenAI v1 Chat Completions API with API-key authentication only. Configure your resource-specific endpoint with providers.azure-foundry.url; the canonical form is https://<resource>.openai.azure.com/openai/v1. The model remains the catalog identity, while requestModel optionally specifies the Azure deployment name:

{
  "providers": {
    "azure-foundry": {
      "url": "https://my-resource.openai.azure.com/openai/v1",
      "requestModel": "production-coding-deployment"
    }
  },
  "agents": [{ "model": "azure-foundry:gpt-5.5", "userInvocable": true }]
}

Microsoft Entra ID is not supported yet. See the Microsoft Foundry Chat Completions documentation.

Fireworks AI

Fireworks uses https://api.fireworks.ai/inference/v1 by default. Serverless catalog IDs use accounts/fireworks/models/<model>. For a dedicated deployment, retain that catalog model identity and set requestModel to the deployment ID:

{
  "providers": {
    "fireworks": {
      "requestModel": "accounts/acme/deployments/coding-prod"
    }
  },
  "agents": [{ "model": "fireworks:accounts/fireworks/models/glm-5p1", "userInvocable": true }]
}

requestModel only changes the request target; model capabilities, context window, display name, and telemetry continue to use model. See the Fireworks querying guide.
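
The split between catalog identity and request target can be sketched as follows. This is a minimal illustration under stated assumptions, not Aether's implementation: `resolve_request_target` is a hypothetical helper, and the returned dictionary keys are invented for the example.

```python
def resolve_request_target(model_spec: str, providers: dict) -> dict:
    """Return the catalog identity and the model ID sent in requests."""
    provider, _, catalog_id = model_spec.partition(":")
    override = providers.get(provider, {}).get("requestModel")
    return {
        "provider": provider,
        # Drives capabilities, context window, display name, and telemetry.
        "catalog_model": catalog_id,
        # Sent in the API request; falls back to the catalog ID when unset.
        "request_model": override or catalog_id,
    }


providers = {"fireworks": {"requestModel": "accounts/acme/deployments/coding-prod"}}
target = resolve_request_target("fireworks:accounts/fireworks/models/glm-5p1", providers)
print(target["catalog_model"])
print(target["request_model"])
```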

Gemini


Credentials	`GEMINI_API_KEY`
Model syntax	`gemini:<model-id>`

AWS Bedrock


Credentials	AWS credential chain (environment, config file, IAM role, etc.)
Model syntax	`bedrock:<model-id>`

Use the Bedrock foundation model ID or system-defined inference profile ID as the Aether model identity. Aether uses this value for context windows, multimodal/reasoning capability checks, and Bedrock prompt caching decisions.

If you route traffic through an inference profile, configure the profile ARN as a provider override instead of putting the ARN in model:

{
  "agents": [
    {
      "name": "BedrockProfile",
      "description": "Routes Bedrock requests through an application inference profile",
      "model": "bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0",
      "providers": {
        "bedrock": {
          "inferenceProfileArn": "arn:aws:bedrock:us-west-2:000000000000:application-inference-profile/000000000000"
        }
      },
      "userInvocable": true
    }
  ]
}

Do not use bedrock:arn:aws:bedrock:... as the model. The ARN only identifies the request target; it does not tell Aether which model capabilities or context window to use.
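
A settings file can be checked for this mistake before it is used. The sketch below is a hypothetical lint helper (`bedrock_model_problems` is not part of Aether); it only inspects the agent's model string and assumes an ARN always starts with `arn:`.

```python
def bedrock_model_problems(agent: dict) -> list[str]:
    """Return human-readable problems with an agent's Bedrock model setting."""
    provider, _, model_id = agent.get("model", "").partition(":")
    problems = []
    if provider == "bedrock" and model_id.startswith("arn:"):
        problems.append(
            "model is an ARN; use the foundation model ID and set "
            "providers.bedrock.inferenceProfileArn instead"
        )
    return problems


bad = {"model": "bedrock:arn:aws:bedrock:us-west-2:000000000000:application-inference-profile/000000000000"}
good = {"model": "bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0"}
print(bedrock_model_problems(bad))
print(bedrock_model_problems(good))
```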

Models served by the Responses API (Bedrock Mantle)

Some Bedrock models are not available through the Converse API. Aether routes these to Bedrock’s OpenAI-compatible Responses endpoint (bedrock-mantle.${AWS_REGION}.api.aws) instead. The model catalog decides the transport automatically based on the model ID, so you configure them exactly like any other Bedrock model:

{
  "agents": [
    {
      "name": "Mantle",
      "description": "Bedrock-hosted OpenAI model via the Responses API",
      "model": "bedrock:openai.gpt-5.6-luna",
      "userInvocable": true
    }
  ]
}

Catalog IDs that use the Responses transport include the openai.gpt-5.6-*, openai.gpt-5.5, openai.gpt-5.4, xai.grok-4.3, and openai.gpt-oss-* families (for example, bedrock:openai.gpt-5.5 or bedrock:openai.gpt-oss-120b). All other Bedrock models use the Converse API described above.
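
The transport decision can be pictured as a pattern match on the model ID. The sketch below is illustrative only: the real decision lives in Aether's model catalog, `bedrock_transport` is a hypothetical name, the pattern list simply copies the families named above (and may not be exhaustive), and treating the non-wildcard entries as exact matches is an assumption.

```python
from fnmatch import fnmatchcase

# Families listed above as using the Responses transport.
RESPONSES_PATTERNS = (
    "openai.gpt-5.6-*",
    "openai.gpt-5.5",
    "openai.gpt-5.4",
    "xai.grok-4.3",
    "openai.gpt-oss-*",
)


def bedrock_transport(model_id: str) -> str:
    """Return "responses" for Mantle-served models, otherwise "converse"."""
    if any(fnmatchcase(model_id, pattern) for pattern in RESPONSES_PATTERNS):
        return "responses"
    return "converse"


print(bedrock_transport("openai.gpt-oss-120b"))
print(bedrock_transport("anthropic.claude-sonnet-4-5-20250929-v1:0"))
```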

Responses-transport requests authenticate with the first available scheme:

1. Bearer token: a Bedrock API key from the AWS_BEARER_TOKEN_BEDROCK environment variable. This takes precedence when set.
2. SigV4: signed requests derived from the standard AWS credential chain (environment variables, ~/.aws/credentials, IAM role, SSO), so existing IAM/SSO setups work without extra configuration.

The endpoint region is resolved from AWS_REGION (then AWS_DEFAULT_REGION), defaulting to us-east-1.
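
The precedence rules for authentication and region can be summarized in a few lines. This is a sketch of the documented order, not Aether's code: `mantle_request_settings` is a hypothetical helper, it does not sign or send anything, and treating an empty environment variable as unset is an assumption.

```python
def mantle_request_settings(env: dict) -> dict:
    """Pick the auth scheme, region, and host for a Responses-transport call."""
    # AWS_REGION wins, then AWS_DEFAULT_REGION, then the us-east-1 default.
    region = env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION") or "us-east-1"
    # A Bedrock API key takes precedence over SigV4 when it is set.
    auth = "bearer" if env.get("AWS_BEARER_TOKEN_BEDROCK") else "sigv4"
    return {
        "auth": auth,
        "region": region,
        "host": f"bedrock-mantle.{region}.api.aws",
    }


print(mantle_request_settings({"AWS_DEFAULT_REGION": "eu-west-1"}))
print(mantle_request_settings({"AWS_BEARER_TOKEN_BEDROCK": "example-key"}))
```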

Inference profiles are a Converse-API feature and do not apply to Responses-transport models. If you set inferenceProfileArn alongside a Mantle model, Aether ignores it.

Ollama


Credentials	None
Model syntax	`ollama:<model-id>`
Discovery host	`OLLAMA_HOST`

Aether discovers models you have pulled into Ollama by querying $OLLAMA_HOST/api/tags (http://localhost:11434 by default). To route inference requests to a different host, override the provider URL:

{
  "providers": {
    "ollama": { "url": "http://ollama-host:11434/v1" }
  }
}
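
To see what the discovery step returns, you can query the same endpoint yourself. The sketch below uses only the Python standard library; the helper names are hypothetical, and it assumes the usual Ollama response shape for /api/tags (a top-level `models` list whose entries carry a `name`). It is a way to inspect the endpoint, not a copy of Aether's discovery code.

```python
import json
import os
import urllib.request


def ollama_tags_url(env=os.environ) -> str:
    """Build the discovery URL from OLLAMA_HOST, with the documented default."""
    host = env.get("OLLAMA_HOST", "http://localhost:11434")
    return host.rstrip("/") + "/api/tags"


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [model["name"] for model in tags_payload.get("models", [])]


def discover_ollama_models(timeout: float = 5.0) -> list[str]:
    """Fetch the list of pulled models; requires a running Ollama server."""
    with urllib.request.urlopen(ollama_tags_url(), timeout=timeout) as resp:
        return model_names(json.load(resp))
```

Each returned name can be used after the `ollama:` prefix in a model spec.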

llama.cpp


Credentials	None
Model syntax	`llamacpp:<model-id>`
Discovery host	`LLAMA_CPP_HOST`

Aether queries $LLAMA_CPP_HOST/v1/models (http://localhost:8080 by default) for the currently loaded model. To route inference requests to a different host, override the provider URL:

{
  "providers": {
    "llamacpp": { "url": "http://llamacpp-host:8080/v1" }
  }
}

Credentials store

OAuth tokens (such as the Codex login) are stored in a credential store. By default, Aether uses the OS keyring. Override it with the top-level credentialsStore field:

{
  "credentialsStore": { "type": "keyring" }
}

Type	Properties	Description
`keyring`	—	OS keyring (default).
`memory`	—	In-memory only; cleared when the process exits. Intended for tests.
`encryptedFile`	`path` (optional), `passwordEnv` (optional)	Encrypted file. `path` defaults to `.aether/credentials.enc` in the Aether home; `passwordEnv` names the env var holding the passphrase (defaults to `AETHER_CREDENTIALS_PASSWORD`).

{
  "credentialsStore": {
    "type": "encryptedFile",
    "path": "/secure/aether-credentials.enc",
    "passwordEnv": "AETHER_CREDENTIALS_PASSWORD"
  }
}
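
The defaults in the table can be spelled out as a small resolution step. This sketch is an assumption-laden illustration: `resolve_credentials_store` is a hypothetical function, the location of the Aether home is passed in because it is not specified here, and joining the home directory with `.aether/credentials.enc` is one reading of the default path.

```python
from pathlib import Path

KNOWN_TYPES = {"keyring", "memory", "encryptedFile"}


def resolve_credentials_store(config: dict, aether_home: Path) -> dict:
    """Fill in the documented defaults for the credentialsStore setting."""
    store = config.get("credentialsStore", {"type": "keyring"})
    kind = store.get("type", "keyring")
    if kind not in KNOWN_TYPES:
        raise ValueError(f"unknown credentialsStore type: {kind!r}")
    if kind != "encryptedFile":
        # keyring and memory take no properties.
        return {"type": kind}
    default_path = aether_home / ".aether" / "credentials.enc"
    return {
        "type": kind,
        "path": Path(store["path"]) if store.get("path") else default_path,
        # Name of the env var holding the passphrase, not the passphrase itself.
        "passwordEnv": store.get("passwordEnv", "AETHER_CREDENTIALS_PASSWORD"),
    }


print(resolve_credentials_store({"credentialsStore": {"type": "encryptedFile"}}, Path("/home/user")))
```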

Provider overrides

The top-level providers object (or per-agent providers) overrides how Aether connects to a provider. Each entry is keyed by provider name:

Field	Description
`url`	Base URL for API requests. Used by OpenAI-compatible, Foundry, and local providers.
`auth`	Authentication mode: `default` (provider default) or `none` (skip auth). Use `none` only behind a trusted proxy that injects credentials.
`requestModel`	Overrides the model id sent in requests without changing the catalog identity used for capabilities, context window, and display.
`inferenceProfileArn`	Bedrock application inference profile ARN to route requests through.

{
  "providers": {
    "azure-foundry": {
      "url": "https://my-resource.openai.azure.com/openai/v1",
      "requestModel": "production-coding-deployment"
    },
    "bedrock": {
      "inferenceProfileArn": "arn:aws:bedrock:us-west-2:000000000000:application-inference-profile/000000000000"
    }
  }
}

The CLI equivalents are --provider PROVIDER.url=..., --provider PROVIDER.auth=..., --provider PROVIDER.request-model=..., and --provider bedrock.inference-profile-arn=....
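
The correspondence between the CLI flags and the providers object can be sketched as a simple mapping. This is an illustration of the naming relationship only: `apply_provider_flag` is a hypothetical helper, it handles just the four flag names listed above, and it does not reproduce how Aether actually parses or validates its command line.

```python
# CLI flag name (kebab-case) to settings field (camelCase).
FLAG_TO_FIELD = {
    "url": "url",
    "auth": "auth",
    "request-model": "requestModel",
    "inference-profile-arn": "inferenceProfileArn",
}


def apply_provider_flag(providers: dict, assignment: str) -> dict:
    """Apply one PROVIDER.flag=value assignment to a providers object."""
    # Split on the first "=" so values may contain "=" or "." themselves.
    target, eq, value = assignment.partition("=")
    provider, dot, flag = target.partition(".")
    if not eq or not dot or flag not in FLAG_TO_FIELD:
        raise ValueError(f"expected PROVIDER.field=value, got {assignment!r}")
    providers.setdefault(provider, {})[FLAG_TO_FIELD[flag]] = value
    return providers


providers = {}
apply_provider_flag(providers, "azure-foundry.url=https://my-resource.openai.azure.com/openai/v1")
apply_provider_flag(providers, "azure-foundry.request-model=production-coding-deployment")
print(providers)
```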

Reasoning effort

Some providers support extended thinking. Set reasoningEffort on an agent to request a thinking budget:

Value	Meaning
`"minimal"`	Minimal or no thinking
`"low"`	Light thinking
`"medium"`	Moderate thinking
`"high"`	Extended thinking
`"xhigh"`	Extra-high thinking budget, for providers that expose a level above `"high"`
`"max"`	Maximum thinking budget for providers that expose a top level

{
  "agents": [
    {
      "name": "Plan",
      "description": "Plans implementation strategy",
      "model": "codex:gpt-5.5",
      "reasoningEffort": "xhigh",
      "userInvocable": true,
      "prompts": [".aether/PLAN.md"]
    }
  ]
}

Providers that do not support the requested effort ignore it or map it to their closest available setting.
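
One way to picture "closest available setting" is as a nearest-neighbor lookup on the ordered list of levels. The sketch below is purely illustrative: `closest_effort` is a hypothetical function, each provider's real mapping may differ, and resolving ties toward the lower level is an assumption.

```python
# Levels in ascending order, as listed in the table above.
EFFORT_LEVELS = ["minimal", "low", "medium", "high", "xhigh", "max"]


def closest_effort(requested: str, supported: list[str]):
    """Return the supported level nearest to the requested one, or None."""
    if requested not in EFFORT_LEVELS:
        raise ValueError(f"unknown reasoningEffort: {requested!r}")
    if not supported:
        # Models the "ignore it" case: the provider has no effort setting.
        return None
    want = EFFORT_LEVELS.index(requested)
    # Sort ascending so that min() resolves ties toward the lower level.
    candidates = sorted(supported, key=EFFORT_LEVELS.index)
    return min(candidates, key=lambda level: abs(EFFORT_LEVELS.index(level) - want))


print(closest_effort("xhigh", ["low", "medium", "high"]))
print(closest_effort("minimal", ["low", "high"]))
```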

Alloying

An alloy spec is a comma-separated list of model specs. Each conversation turn uses the next model spec in the list, cycling round-robin.

{
  "agents": [
    {
      "name": "Alloy",
      "description": "Alternates between cloud and local models",
      "model": "zai:glm-5.1,ollama:llama3.2",
      "userInvocable": true,
      "prompts": [".aether/ALLOY.md"]
    }
  ]
}

Every model in the alloy must be usable in the current runtime environment: required credentials must be set, and local servers must be running.
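
The rotation and the environment requirement can be sketched together. This is a minimal illustration, not Aether's scheduler: `alloy_turns` and `missing_env_vars` are hypothetical helpers, starting the rotation at the first spec is an assumption, and the check covers only the API-key providers listed on this page (it cannot tell whether a local server is running or whether an OAuth login exists).

```python
from itertools import cycle

# API-key environment variables taken from the provider tables on this page.
REQUIRED_ENV = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "fireworks": "FIREWORKS_API_KEY",
    "azure-foundry": "AZURE_OPENAI_API_KEY",
    "moonshot": "MOONSHOT_API_KEY",
    "zai": "ZAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def alloy_turns(spec: str):
    """Yield the model spec for each successive turn, cycling round-robin."""
    return cycle([part.strip() for part in spec.split(",")])


def missing_env_vars(spec: str, env: dict) -> list[str]:
    """List the API-key variables an alloy needs that are not set in env."""
    missing = []
    for part in spec.split(","):
        provider = part.strip().partition(":")[0]
        var = REQUIRED_ENV.get(provider)
        if var and not env.get(var):
            missing.append(var)
    return missing


turns = alloy_turns("zai:glm-5.1,ollama:llama3.2")
print([next(turns) for _ in range(4)])
print(missing_env_vars("zai:glm-5.1,ollama:llama3.2", {}))
```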