Providers

The Gemini, Anthropic, and OpenAI-compatible Model implementations, their configuration, environment variables, feature flags, and transport-security guarantees.

adk-rs ships three provider clients — Gemini, Anthropic, and OpenAi — each behind its own cargo feature and each implementing the same Model trait. Because agents only see Arc<dyn Model>, switching providers is a one-line change.

At a glance

| Provider | Feature flag | Env var(s) | supported_models |
|---|---|---|---|
| providers::gemini::Gemini | gemini | GOOGLE_API_KEY | gemini-* |
| providers::anthropic::Anthropic | anthropic | ANTHROPIC_API_KEY | claude-* |
| providers::openai::OpenAi | openai | OPENAI_API_KEY, optional OPENAI_BASE_URL | openai/*, gpt-*, o1-*, o3-*, azure/*, ollama/*, groq/* |

Cargo.toml (toml)
[dependencies]
adk-rs = { version = "0.6", features = ["gemini", "anthropic", "openai"] }

Retries

Every provider client retries transient failures automatically: 429 rate limits (honouring a Retry-After header up to 60s), 408/409/5xx responses, and connect/timeout transport errors. Backoff is exponential with full jitter. The policy lives on each config as a retry: RetryConfig field; the default mirrors the official provider SDKs — 2 retries, 500ms initial backoff, 8s cap. Other 4xx errors (bad request, auth) fail immediately.

Tuning or disabling the retry policy (rust)
use adk_rs::core::RetryConfig;
use adk_rs::providers::gemini::{Gemini, GeminiConfig};
use std::time::Duration;

let model = Gemini::new("gemini-2.5-flash", GeminiConfig {
    api_key: std::env::var("GOOGLE_API_KEY")?,
    retry: RetryConfig {
        max_retries: 5,
        initial_backoff: Duration::from_millis(250),
        max_backoff: Duration::from_secs(30),
        ..RetryConfig::default()
    },
    // or: retry: RetryConfig::disabled(),
    ..GeminiConfig::default()
})?;
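
To make the retry description concrete, here is a minimal sketch of the status classification and the "exponential with full jitter" delay, assuming the usual formula of a uniform sample between zero and min(cap, initial * 2^attempt). The function names are hypothetical and are not part of adk-rs; the random sample is passed in by the caller so the sketch stays dependency-free.

```rust
use std::time::Duration;

/// Status codes the section lists as transient: 408, 409, 429, and any 5xx.
/// Every other 4xx (bad request, auth) is treated as permanent.
fn is_retryable_status(status: u16) -> bool {
    matches!(status, 408 | 409 | 429) || (500..=599).contains(&status)
}

/// Upper bound for the sleep before retry number `attempt` (0-based):
/// `initial * 2^attempt`, clamped to `cap`. Overflow saturates to `cap`.
fn backoff_ceiling(initial: Duration, cap: Duration, attempt: u32) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    initial.checked_mul(factor).unwrap_or(cap).min(cap)
}

/// Full jitter: scale the ceiling by a uniform sample in [0.0, 1.0).
/// `unit_random` is supplied by the caller (for example from a PRNG).
fn full_jitter_delay(
    initial: Duration,
    cap: Duration,
    attempt: u32,
    unit_random: f64,
) -> Duration {
    backoff_ceiling(initial, cap, attempt).mul_f64(unit_random.clamp(0.0, 1.0))
}
```

With the default policy (500ms initial backoff, 8s cap), the ceilings for successive attempts under this formula are 500ms, 1s, 2s, 4s, and then 8s from the fifth attempt onward. How the real client combines this with a Retry-After header is not shown here.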

Gemini

The Gemini client speaks the generateContent REST API and supports true SSE streaming. Gemini::from_env(model_name) reads $GOOGLE_API_KEY; Gemini::new(model_name, GeminiConfig) gives full control. GeminiConfig has five fields: base_url (default https://generativelanguage.googleapis.com), api_version (default v1beta), api_key, timeout (default 60 s), and retry. timeout is the total timeout for non-streaming requests only. Streaming (SSE) requests are exempt and use just a 10-second connect timeout, so a long generation is never cut off mid-stream; the same rule applies to all three providers. The API key travels in the x-goog-api-key header.

  • Streaming: stream_generate_content POSTs to :streamGenerateContent?alt=sse and decodes the SSE chunks into a stream of LlmResponse values.
  • Server-side built-in tools — when the request config carries Tool::GoogleSearch {}, Tool::UrlContext {}, or Tool::CodeExecution {} (injected by the Gemini built-in tool handles), Gemini runs search grounding, URL grounding, or sandboxed Python on Google’s servers.
  • Context caching — if the request carries a ContextCacheConfig, the client creates a server-side cachedContents entry for the stable prefix (system instruction + tools), reuses it on later calls keyed by a fingerprint, and transparently retries without the cache if the server rejects a stale entry. See Context caching.
  • Live API — with the live feature, Gemini::connect_live opens a bidirectional WebSocket session for realtime text and audio. See Gemini Live.
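
The SSE decoding mentioned in the Streaming item can be illustrated with a small, dependency-free sketch. It assumes the standard server-sent-events framing (events separated by a blank line, payload carried in data: fields) and operates on a complete body for simplicity. The function name is hypothetical; the real client decodes incrementally and then parses each payload into an LlmResponse, which is omitted here.

```rust
/// Splits an SSE body into the `data:` payload of each event.
/// Multiple `data:` lines in one event are joined with '\n', as the
/// SSE format specifies. Other fields and comment lines are ignored.
fn sse_data_payloads(body: &str) -> Vec<String> {
    let mut payloads = Vec::new();
    let mut current: Vec<&str> = Vec::new();

    // `lines()` accepts both "\n" and "\r\n" line endings.
    for line in body.lines() {
        if line.is_empty() {
            // A blank line terminates the current event.
            if !current.is_empty() {
                payloads.push(current.join("\n"));
                current.clear();
            }
        } else if let Some(data) = line.strip_prefix("data:") {
            // One optional space after the colon is not part of the payload.
            current.push(data.strip_prefix(' ').unwrap_or(data));
        }
        // `event:`, `id:`, `retry:` and ":" comment lines fall through.
    }

    // Lenient choice for this sketch: keep a final event that was not
    // followed by a blank line.
    if !current.is_empty() {
        payloads.push(current.join("\n"));
    }
    payloads
}
```

Each returned string would be one JSON chunk of the streamGenerateContent response; a production decoder would work on byte chunks as they arrive rather than on a buffered body.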

Anthropic

The Anthropic client targets the Messages API (POST {base_url}/v1/messages) with x-api-key and anthropic-version headers. Anthropic::from_env(model_name) reads $ANTHROPIC_API_KEY; AnthropicConfig exposes base_url (default https://api.anthropic.com), anthropic_version (default 2023-06-01), api_key, timeout, and retry.

  • Streaming — native SSE: text and thinking deltas are emitted as partial chunks the moment they arrive, tool-call arguments accumulate across input_json_delta fragments and surface as one complete FunctionCall, and the final chunk carries the stop reason and usage.
  • Multimodal input: Part::InlineData images become base64 image blocks, inline PDFs become document blocks, and https:// Part::FileData references become URL sources. Unsupported parts are dropped with a warning, never silently.
  • Prompt caching — a ContextCacheConfig on the request becomes a cache_control breakpoint on the system block (or the last tool when there is no system instruction), so Anthropic caches the stable prefix server-side. Cache activity surfaces on event.response.cache_metadata and usage_metadata.cached_content_token_count.
  • Extended thinking: GenerateContentConfig.thinking_config.thinking_budget maps to the Messages API thinking parameter ({"type": "enabled", "budget_tokens": N}). The default max_tokens grows to budget + 2048 so thinking never starves the answer (an explicit max_output_tokens is always respected), and temperature/top_p/top_k are dropped while thinking is enabled, because the API rejects them together. Thinking blocks round-trip with their cryptographic signature, redacted_thinking blocks are preserved as Part::RedactedThought, and streaming handles thinking_delta + signature_delta.
  • Forward compatibility — unknown content-block types in responses are skipped instead of failing the whole response; the refusal stop reason maps to FinishReason::Safety and pause_turn to Stop.
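
The extended-thinking parameter rules in the list above can be sketched as a pure function. The types, the function name, and the 4096-token base default are hypothetical and chosen only for illustration. The sketch also assumes that "grows to budget + 2048" means the larger of the base default and budget + 2048.

```rust
/// Hypothetical fallback when the caller sets no limit; not taken from adk-rs.
const BASE_DEFAULT_MAX_TOKENS: u32 = 4096;
/// Headroom from the text: the default max_tokens grows to budget + 2048.
const THINKING_HEADROOM: u32 = 2048;

/// Illustrative stand-in for the request-side generation config.
#[derive(Debug, Clone, Default, PartialEq)]
struct GenerationSettings {
    max_output_tokens: Option<u32>,
    thinking_budget: Option<u32>,
    temperature: Option<f32>,
    top_p: Option<f32>,
    top_k: Option<u32>,
}

/// Illustrative stand-in for the fields sent to the Messages API.
#[derive(Debug, Clone, PartialEq)]
struct MessagesParams {
    max_tokens: u32,
    thinking_budget_tokens: Option<u32>,
    temperature: Option<f32>,
    top_p: Option<f32>,
    top_k: Option<u32>,
}

fn to_messages_params(s: &GenerationSettings) -> MessagesParams {
    let thinking = s.thinking_budget.is_some();
    // Assumption: the default never shrinks below the base default.
    let default_max = match s.thinking_budget {
        Some(budget) => BASE_DEFAULT_MAX_TOKENS.max(budget.saturating_add(THINKING_HEADROOM)),
        None => BASE_DEFAULT_MAX_TOKENS,
    };
    MessagesParams {
        // An explicit max_output_tokens is always respected.
        max_tokens: s.max_output_tokens.unwrap_or(default_max),
        thinking_budget_tokens: s.thinking_budget,
        // Sampling knobs are dropped while thinking is enabled.
        temperature: if thinking { None } else { s.temperature },
        top_p: if thinking { None } else { s.top_p },
        top_k: if thinking { None } else { s.top_k },
    }
}
```

Under these assumptions, a budget of 8000 with no explicit limit yields max_tokens = 10048, and any temperature set on the request is omitted from the outgoing parameters.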

OpenAI (and Azure, Ollama, Groq)

The OpenAi client speaks the chat/completions protocol, which makes it the bridge to every OpenAI-compatible endpoint. OpenAi::from_env(model_name) reads $OPENAI_API_KEY and honours $OPENAI_BASE_URL (default https://api.openai.com/v1). OpenAiConfig adds api_version (appended as Azure’s ?api-version= query parameter) and organization (sent as the OpenAI-Organization header). The key travels as Authorization: Bearer ....

  • Streaming — native SSE with stream_options: {include_usage: true}: content deltas stream as partial chunks; fragment-wise tool calls are reassembled by index and emitted complete in the final chunk alongside the finish reason and usage.
  • Multimodal input — text-only user messages stay plain strings; messages with images switch to the content-parts form, mapping inline images to data: URI image_url parts and https:// image references to plain image_url parts.
  • Reasoning models: max_output_tokens is sent as max_completion_tokens for the o-series / gpt-5 family (which reject the deprecated max_tokens with a 400) and as max_tokens for everything else, keeping older OpenAI-compatible servers working.
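
The "reassembled by index" behaviour from the Streaming item above can be sketched roughly as follows. The struct and method names are hypothetical, not the crate's internals. The sketch assumes the chat/completions streaming shape in which each tool-call fragment carries an index, the id and function name arrive once, and the JSON arguments arrive as string pieces.

```rust
use std::collections::BTreeMap;

/// One streamed fragment of a tool call (illustrative shape).
#[derive(Debug, Clone, Default, PartialEq)]
struct ToolCallDelta {
    index: usize,
    id: Option<String>,
    name: Option<String>,
    arguments: String, // partial JSON text, possibly empty
}

/// A fully reassembled tool call.
#[derive(Debug, Clone, Default, PartialEq)]
struct ToolCall {
    id: String,
    name: String,
    arguments: String,
}

/// Accumulates fragments keyed by their index.
#[derive(Default)]
struct ToolCallAssembler {
    calls: BTreeMap<usize, ToolCall>,
}

impl ToolCallAssembler {
    /// Merges one fragment into the call it belongs to.
    fn push(&mut self, delta: ToolCallDelta) {
        let call = self.calls.entry(delta.index).or_default();
        if let Some(id) = delta.id {
            call.id = id;
        }
        if let Some(name) = delta.name {
            call.name = name;
        }
        call.arguments.push_str(&delta.arguments);
    }

    /// Returns the completed calls in index order, for the final chunk.
    fn finish(self) -> Vec<ToolCall> {
        self.calls.into_values().collect()
    }
}
```

A BTreeMap keeps the calls ordered by index even when fragments for parallel tool calls interleave, which is why it is used here instead of a Vec.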

Pointing the same client at different backends (bash)
# Stock OpenAI
export OPENAI_API_KEY=sk-...

# Local Ollama (loopback HTTP is allowed)
export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_API_KEY=ollama   # any non-empty value

# Groq
export OPENAI_BASE_URL=https://api.groq.com/openai/v1
export OPENAI_API_KEY=gsk_...
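
Because one client serves all of these backends, the choice between max_tokens and max_completion_tokens described in the Reasoning models item has to be made per model name. A minimal sketch of such a check follows. The prefix matching is an assumption about how "o-series / gpt-5 family" might be detected, and both function names are hypothetical.

```rust
/// Guesses whether a model belongs to the o-series or gpt-5 family.
/// Assumption: a routing prefix such as "openai/" is stripped first, and
/// o-series names look like 'o' followed by a digit (o1, o3-mini, ...).
fn uses_max_completion_tokens(model: &str) -> bool {
    let name = model.rsplit('/').next().unwrap_or(model);
    let is_o_series = name
        .strip_prefix('o')
        .and_then(|rest| rest.chars().next())
        .map_or(false, |c| c.is_ascii_digit());
    is_o_series || name.starts_with("gpt-5")
}

/// Returns the JSON field name to use for the output-token limit.
fn token_limit_field(model: &str) -> &'static str {
    if uses_max_completion_tokens(model) {
        "max_completion_tokens"
    } else {
        "max_tokens"
    }
}
```

Under this heuristic, "o3-mini" and "openai/gpt-5" select max_completion_tokens, while "gpt-4o-mini" and "ollama/llama3" keep max_tokens. An Azure deployment with an arbitrary name cannot be classified by name alone, so the real client may use different logic.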

Embedders

Both the Gemini and OpenAI features ship an Embedder implementation for semantic memory: GeminiEmbedder (the batchEmbedContents API, e.g. gemini-embedding-001) and OpenAiEmbedder (the /embeddings endpoint, e.g. text-embedding-3-small — also reaches Azure and Ollama via OPENAI_BASE_URL). Each has the same from_env / config constructors as its chat sibling and shares the retry policy. Plug either into VectorMemoryService.

Transport security: HTTPS or loopback

Every provider constructor validates its base URL with transport_security::require_secure_url before building the HTTP client. The rule: the destination must be https://, or a plaintext-HTTP loopback host (localhost, any 127.0.0.0/8 address, or [::1]). A public http:// base URL is rejected with a configuration error rather than silently shipping your API key in cleartext. Loopback stays allowed so local mocks, Ollama, and test servers keep working. All three clients also disable HTTP redirects (redirect::Policy::none()) — reqwest re-sends custom headers on redirect, so a redirecting endpoint could otherwise exfiltrate the API key to another host.

Rejected at construction time (rust)
use adk_rs::providers::openai::{OpenAi, OpenAiConfig};

let err = OpenAi::new(
    "gpt-4o-mini",
    OpenAiConfig {
        base_url: "http://api.example.com/v1".into(), // plaintext, non-loopback
        api_key: "sk-...".into(),
        ..OpenAiConfig::default()
    },
)
.unwrap_err(); // "base_url must be https:// or point to a loopback host ..."
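
The HTTPS-or-loopback rule itself can be approximated with the standard library alone. The sketch below is not the body of transport_security::require_secure_url: the function name and the hand-rolled URL splitting are assumptions made to keep the example dependency-free, and a production check would more likely rely on a real URL parser.

```rust
use std::net::IpAddr;

/// Returns true for https:// URLs with a host, and for http:// URLs whose
/// host is `localhost`, a 127.0.0.0/8 address, or `[::1]`.
fn is_secure_base_url(url: &str) -> bool {
    let (scheme, rest) = match url.split_once("://") {
        Some(parts) => parts,
        None => return false,
    };
    // Authority ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Drop optional "user:pass@" userinfo.
    let host_port = authority.rsplit('@').next().unwrap_or(authority);
    // IPv6 literals are bracketed; otherwise the host ends at the port colon.
    let host = match host_port.strip_prefix('[') {
        Some(v6) => match v6.split_once(']') {
            Some((h, _)) => h,
            None => return false,
        },
        None => host_port.split(':').next().unwrap_or(""),
    };
    if host.is_empty() {
        return false;
    }
    match scheme.to_ascii_lowercase().as_str() {
        "https" => true,
        "http" => {
            host.eq_ignore_ascii_case("localhost")
                || host.parse::<IpAddr>().map_or(false, |ip| ip.is_loopback())
        }
        _ => false,
    }
}
```

Comparing the whole host, rather than testing a prefix, matters here: a host such as localhost.evil.com must not pass as loopback.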

Swapping providers behind Arc<dyn Model>

The pattern from the repo’s three_providers example: build whichever clients have credentials, then hand any of them to the same agent.

examples/three_providers.rs, condensed (rust)
use adk_rs::agents::LlmAgent;
use adk_rs::core::Model;
use adk_rs::providers::anthropic::Anthropic;
use adk_rs::providers::gemini::Gemini;
use adk_rs::providers::openai::OpenAi;
use std::sync::Arc;

let mut models: Vec<(&str, Arc<dyn Model>)> = Vec::new();
if std::env::var("GOOGLE_API_KEY").is_ok() {
    models.push(("Gemini", Arc::new(Gemini::from_env("gemini-2.5-flash")?)));
}
if std::env::var("ANTHROPIC_API_KEY").is_ok() {
    models.push(("Claude", Arc::new(Anthropic::from_env("claude-3-5-sonnet")?)));
}
if std::env::var("OPENAI_API_KEY").is_ok() {
    models.push(("OpenAI", Arc::new(OpenAi::from_env("gpt-4o-mini")?)));
}

for (label, model) in models {
    let agent = LlmAgent::builder("greeter")
        .model(model) // same builder, any provider
        .instruction("Be concise.")
        .build()?;
    println!("=== {label} ===");
    // ... run via Runner as usual
}