# Provider Interface
The `llm` crate provides a unified interface for streaming LLM responses across providers. The core abstraction is the `StreamingModelProvider` trait.
```toml
[dependencies]
llm = "0.1"
```

## StreamingModelProvider
```rust
pub trait StreamingModelProvider: Send + Sync {
    fn stream_response(&self, context: &Context) -> LlmResponseStream;
    fn display_name(&self) -> String;
    fn context_window(&self) -> Option<u32>;
    fn model(&self) -> Option<LlmModel> {
        None
    }
}
```

| Method | Description |
|---|---|
| `stream_response(context)` | Stream a response for the given conversation context |
| `display_name()` | Human-readable provider + model name |
| `context_window()` | Maximum context size in tokens, if known |
| `model()` | The model catalog entry (optional) |
`LlmResponseStream` is `Pin<Box<dyn Stream<Item = Result<LlmResponse>> + Send>>`.
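As an illustration, a minimal mock implementation might look like the sketch below. The provider name, import paths, and the exact field types of the `LlmResponse` variants are assumptions for the sketch, not guarantees about the crate:

```rust
use futures::stream;
use llm::{Context, LlmResponse, LlmResponseStream, StopReason, StreamingModelProvider};

// Hypothetical test provider that streams a canned reply.
struct EchoProvider;

impl StreamingModelProvider for EchoProvider {
    fn stream_response(&self, _context: &Context) -> LlmResponseStream {
        // Emit Start, one Text chunk, and Done, in the order a real provider would.
        let events = vec![
            Ok(LlmResponse::Start { message_id: "echo-1".into() }),
            Ok(LlmResponse::Text { chunk: "Hello from EchoProvider".into() }),
            Ok(LlmResponse::Done { stop_reason: StopReason::EndTurn }),
        ];
        Box::pin(stream::iter(events))
    }

    fn display_name(&self) -> String {
        "Echo (mock)".into()
    }

    fn context_window(&self) -> Option<u32> {
        Some(8_192)
    }
}
```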
## Creating a provider
Use `ProviderFactory` to create providers from environment variables:
```rust
use llm::{ProviderFactory, providers::AnthropicProvider};

let provider = AnthropicProvider::from_env().await?
    .with_model("claude-sonnet-4-5");
```

Available providers: `AnthropicProvider`, `OpenAiProvider`, `OpenRouterProvider`, `GeminiProvider`, `OllamaProvider`, `LlamaCppProvider`, `BedrockProvider` (feature: `bedrock`), `CodexProvider` (feature: `codex`).
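Once you have a provider, you can drive the stream directly. A minimal sketch, assuming the `futures` crate and that `messages` and `tools` are already built:

```rust
use futures::StreamExt;
use llm::{Context, LlmResponse};

let ctx = Context::new(messages, tools);
let mut stream = provider.stream_response(&ctx);

// Print text chunks as they arrive; stop when the response completes.
while let Some(event) = stream.next().await {
    match event? {
        LlmResponse::Text { chunk } => print!("{chunk}"),
        LlmResponse::Done { .. } => break,
        _ => {}
    }
}
```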
## Core types
### Context
The conversation state passed to `stream_response`:
```rust
let mut ctx = Context::new(messages, tools);
ctx.set_reasoning_effort(Some(ReasoningEffort::High));
ctx.set_prompt_cache_key(Some("my-cache".into()));
```

| Method | Description |
|---|---|
| `new(messages, tools)` | Create from messages and tool definitions |
| `add_message(msg)` | Append a message |
| `set_tools(tools)` | Replace tool definitions |
| `set_reasoning_effort(effort)` | Set thinking budget |
| `messages()` | Get all messages |
| `tools()` | Get tool definitions |
| `estimated_token_count()` | Rough token estimate |
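The estimate can be compared against the provider's window before streaming. A sketch, assuming the integer types convert as shown:

```rust
let ctx = Context::new(messages, tools);

// Rough pre-flight check; the compaction strategy itself is up to the caller.
if let Some(window) = provider.context_window() {
    if ctx.estimated_token_count() as u64 >= u64::from(window) {
        // Too large: summarize or drop older messages before calling stream_response.
    }
}
```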
### ChatMessage
```rust
pub enum ChatMessage {
    System { content, timestamp },
    User { content, timestamp },
    Assistant { content, reasoning, timestamp, tool_calls },
    ToolCallResult(Result<ToolCallResult, ToolCallError>),
    Error { message, timestamp },
    Summary { content, timestamp, messages_compacted },
}
```
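A consumer can match on the roles it cares about and ignore the rest. A sketch, assuming displayable `content`/`message` fields and a `Vec` of tool calls:

```rust
// Render a simple transcript from the conversation history.
for msg in ctx.messages() {
    match msg {
        ChatMessage::User { content, .. } => println!("user: {content}"),
        ChatMessage::Assistant { content, tool_calls, .. } => {
            println!("assistant: {content} ({} tool calls)", tool_calls.len());
        }
        ChatMessage::Error { message, .. } => eprintln!("error: {message}"),
        _ => {}
    }
}
```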
### LlmResponse

Chunks emitted by `stream_response`:
| Variant | Description |
|---|---|
| `Start { message_id }` | New response started |
| `Text { chunk }` | Text content chunk |
| `Reasoning { chunk }` | Extended thinking chunk |
| `ToolRequestStart { id, name }` | Tool call beginning |
| `ToolRequestArg { id, chunk }` | Streaming tool arguments |
| `ToolRequestComplete { tool_call }` | Tool call fully formed |
| `Usage { input_tokens, output_tokens, cached_input_tokens }` | Token usage |
| `Done { stop_reason }` | Response complete |
| `Error { message }` | Error during streaming |
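Tool calls arrive in three phases, so a consumer typically buffers argument chunks by `id` until the complete event arrives. A sketch, assuming `String` ids and chunks; `run_tool` is a placeholder for your own dispatcher:

```rust
use std::collections::HashMap;

let mut pending_args: HashMap<String, String> = HashMap::new();

while let Some(event) = stream.next().await {
    match event? {
        LlmResponse::ToolRequestStart { id, name } => {
            println!("tool call started: {name}");
            pending_args.insert(id, String::new());
        }
        LlmResponse::ToolRequestArg { id, chunk } => {
            // Arguments stream incrementally; append until the call is complete.
            pending_args.entry(id).or_default().push_str(&chunk);
        }
        LlmResponse::ToolRequestComplete { tool_call } => {
            // The fully formed call is ready to execute.
            // `run_tool` is your own dispatcher, not part of the crate.
            run_tool(tool_call);
        }
        LlmResponse::Done { .. } => break,
        _ => {}
    }
}
```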
### StopReason
Why the model stopped generating:
`EndTurn`, `Length`, `ToolCalls`, `ContentFilter`, `Error`, `Unknown(String)`
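A typical caller branches on the reason carried by `Done`, for example:

```rust
match stop_reason {
    StopReason::ToolCalls => { /* execute the requested tools, then continue the turn */ }
    StopReason::Length => { /* output truncated: raise the limit or compact the context */ }
    StopReason::Unknown(reason) => eprintln!("unrecognized stop reason: {reason}"),
    _ => { /* EndTurn, ContentFilter, Error */ }
}
```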
### ToolDefinition
```rust
pub struct ToolDefinition {
    pub name: String,
    pub description: String,
    pub parameters: String, // JSON Schema
    pub server: Option<String>,
}
```
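Parameters are a raw JSON Schema string. A sketch with a hypothetical `read_file` tool:

```rust
let read_file = ToolDefinition {
    name: "read_file".into(),
    description: "Read a file from the workspace".into(),
    // JSON Schema describing the tool's arguments.
    parameters: r#"{
        "type": "object",
        "properties": { "path": { "type": "string" } },
        "required": ["path"]
    }"#
    .into(),
    // No associated server for this tool.
    server: None,
};
```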
## Error handling

```rust
pub enum LlmError {
    MissingApiKey(String),
    InvalidApiKey(String),
    ApiRequest(String),
    ApiError(String),
    ContextOverflow(ContextOverflowError),
    IoError(String),
    JsonParsing(String),
    ToolParameterParsing { tool_name, error },
    OAuthError(String),
    Other(String),
}
```

`ContextOverflowError` provides `requested_tokens` and `max_tokens` for handling context limits.
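For example, a caller can catch the overflow variant and compact before retrying. A sketch, where `result` is one item pulled from the response stream and error propagation details are up to the application:

```rust
match result {
    Err(LlmError::ContextOverflow(overflow)) => {
        eprintln!(
            "context too large: requested {} tokens, limit is {}",
            overflow.requested_tokens, overflow.max_tokens
        );
        // Compact or truncate the conversation, then retry the request.
    }
    Err(other) => return Err(other),
    Ok(chunk) => { /* handle the LlmResponse chunk */ }
}
```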