AI Providers

Octomind supports 7 AI providers through a unified interface. Provider support is implemented in [octolib](https://github.com/muvon/octolib) -- new pr

All provider types are re-exported from src/providers.rs:

  • AiProvider (trait), ProviderFactory
  • OpenRouterProvider, OpenAiProvider, AnthropicProvider, GoogleVertexProvider
  • AmazonBedrockProvider, CloudflareWorkersAiProvider, DeepSeekProvider
  • GenericToolCall, StructuredOutputRequest — tool/schema types
  • ModelPricing, ProviderExchange, ThinkingBlock, TokenUsage — metadata types

Model Format

All models use provider:model format:

model = "openrouter:anthropic/claude-sonnet-4"
model = "openai:gpt-4o"
model = "anthropic:claude-sonnet-4"
model = "deepseek:deepseek-chat"
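
To make the format concrete, here is a minimal Python sketch of how a provider:model string could be split. The helper name split_model and its error message are hypothetical; this is not octolib's actual parser. It assumes the first colon separates the provider prefix from the model name, so any later colon stays part of the model name.

```python
def split_model(spec: str) -> tuple[str, str]:
    """Split 'provider:model' on the first colon only.

    Hypothetical helper for illustration; not part of Octomind or octolib.
    """
    provider, sep, model = spec.partition(":")
    # Reject strings with no colon, an empty provider, or an empty model.
    if not sep or not provider or not model:
        raise ValueError(f"Invalid model format: {spec!r} (expected provider:model)")
    return provider, model


if __name__ == "__main__":
    for spec in ("openrouter:anthropic/claude-sonnet-4", "openai:gpt-4o"):
        print(split_model(spec))
```

Splitting on the first colon only is a deliberate choice in this sketch: the slash in OpenRouter model names is left untouched, and so is any colon inside the model name itself.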

Supported Providers

OpenRouter (recommended)

Access many providers through a single API key. Best for flexibility and model switching.

export OPENROUTER_API_KEY="your_key"
model = "openrouter:anthropic/claude-sonnet-4"
model = "openrouter:openai/gpt-4o"
model = "openrouter:google/gemini-2.5-flash-preview"

Get a key at openrouter.ai.

OpenAI

Direct access to GPT models.

export OPENAI_API_KEY="your_key"
model = "openai:gpt-4o"
model = "openai:gpt-4o-mini"

Anthropic

Direct access to Claude models. Supports prompt caching for cost savings.

export ANTHROPIC_API_KEY="your_key"
model = "anthropic:claude-sonnet-4"
model = "anthropic:claude-haiku-4-5"

Google (Vertex AI)

Gemini models via Google Cloud.

export GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
model = "google:gemini-2.5-flash-preview"

Requires a Google Cloud project with Vertex AI API enabled.

Amazon (Bedrock)

AWS-hosted models.

export AWS_ACCESS_KEY_ID="your_key"
export AWS_SECRET_ACCESS_KEY="your_secret"
export AWS_REGION="us-east-1"
model = "amazon:anthropic.claude-v2"

Cloudflare (Workers AI)

Edge inference with low latency.

export CLOUDFLARE_API_TOKEN="your_token"
model = "cloudflare:@cf/meta/llama-3.1-8b-instruct"

DeepSeek

Cost-effective models.

export DEEPSEEK_API_KEY="your_key"
model = "deepseek:deepseek-chat"
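
As a quick pre-flight check across all seven providers, the environment variables listed in the sections above can be collected into one lookup table. The sketch below is illustrative only: REQUIRED_ENV and missing_env are hypothetical names, and Octomind's own credential handling may differ (for example, it may accept other credential sources).

```python
import os

# Environment variables named in the provider sections, keyed by model prefix.
REQUIRED_ENV = {
    "openrouter": ["OPENROUTER_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "google": ["GOOGLE_APPLICATION_CREDENTIALS"],
    "amazon": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"],
    "cloudflare": ["CLOUDFLARE_API_TOKEN"],
    "deepseek": ["DEEPSEEK_API_KEY"],
}


def missing_env(model: str, env=None) -> list[str]:
    """Return the variables that are unset or empty for the provider of `model`."""
    env = os.environ if env is None else env
    provider = model.split(":", 1)[0]
    if provider not in REQUIRED_ENV:
        raise ValueError(f"Unknown provider prefix: {provider!r}")
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]


if __name__ == "__main__":
    print(missing_env("anthropic:claude-sonnet-4"))
```

Passing a plain dictionary as env makes the check easy to try without touching the real environment.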

Provider Comparison

Provider   | Format                    | Caching               | Vision              | Structured Output
OpenRouter | openrouter:provider/model | Yes (model-dependent) | Yes                 | Model-dependent
OpenAI     | openai:model              | No                    | Yes (GPT-4o+)       | Yes
Anthropic  | anthropic:model           | Yes (Claude 3.5+)     | Yes (Claude 3+)     | Yes (tool-based)
Google     | google:model              | No                    | Yes (Gemini 1.5+)   | No
Amazon     | amazon:model              | No                    | Yes (Claude models) | No
Cloudflare | cloudflare:model          | No                    | Limited             | No
DeepSeek   | deepseek:model            | No                    | No                  | No
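
The Structured Output column can be turned into a small lookup for choosing a model before sending a schema-constrained request. This is a hedged sketch: the dictionary copies the column values as given in the comparison, while supports_structured_output is a hypothetical helper that treats only an unconditional "Yes" as support.

```python
# Structured Output column of the comparison, copied as-is and keyed by prefix.
STRUCTURED_OUTPUT = {
    "openrouter": "Model-dependent",
    "openai": "Yes",
    "anthropic": "Yes (tool-based)",
    "google": "No",
    "amazon": "No",
    "cloudflare": "No",
    "deepseek": "No",
}


def supports_structured_output(model: str) -> bool:
    """True only when the comparison gives an unconditional 'Yes' for the provider."""
    provider = model.split(":", 1)[0]
    return STRUCTURED_OUTPUT.get(provider, "No").startswith("Yes")


if __name__ == "__main__":
    for model in ("anthropic:claude-sonnet-4", "deepseek:deepseek-chat"):
        print(model, supports_structured_output(model))
```

OpenRouter is reported as unsupported here because its answer depends on the underlying model; a fuller check would need per-model information that this page does not provide.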

Model Selection Strategy

Use Case               | Recommended                                | Why
Main development       | anthropic:claude-sonnet-4                  | Best coding, caching support
Fast queries / layers  | openai:gpt-4o-mini                         | Fast, cheap
Compression decisions  | anthropic:claude-haiku-4-5                 | 10x cheaper than Sonnet
Research / exploration | openrouter:google/gemini-2.5-flash-preview | Large context, fast
Cost-effective         | deepseek:deepseek-chat                     | Lowest cost

Prompt Caching

Providers with caching support can reduce costs by caching repeated context:

  • Anthropic: Automatic for Claude 3.5+ models. Cache write at 1.25x, read at 0.1x cost.
  • OpenRouter: Depends on underlying model.

Caching is always on for supporting providers: the cache marker is moved to the latest message on every turn and uses the provider's long (1h) TTL. No configuration is required.
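
To see what the 1.25x write and 0.1x read multipliers mean over a conversation, the following sketch compares input cost with and without caching. It rests on simplifying assumptions that are not taken from Octomind: every turn resends the whole conversation, tokens from earlier turns are always cache hits, newly added tokens are always cache writes, and the cache never expires between turns. The function name input_cost and the price used in the example are hypothetical.

```python
WRITE_MULT = 1.25  # cache write multiplier quoted above
READ_MULT = 0.1    # cache read multiplier quoted above


def input_cost(turn_tokens: list[int], price_per_mtok: float, cached: bool) -> float:
    """Total input cost in USD when every turn resends the whole conversation.

    turn_tokens[i] is the number of new input tokens added on turn i.
    Assumption: earlier tokens are always read from cache, new ones are written.
    """
    total = 0.0
    prefix = 0  # tokens sent on earlier turns
    for new in turn_tokens:
        if cached:
            units = prefix * READ_MULT + new * WRITE_MULT
        else:
            units = prefix + new
        total += units * price_per_mtok / 1_000_000
        prefix += new
    return total


if __name__ == "__main__":
    turns = [10_000, 1_000, 1_000]
    print(input_cost(turns, 3.0, cached=False))
    print(input_cost(turns, 3.0, cached=True))
```

With a 10,000-token first turn followed by two 1,000-token turns at an assumed $3 per million input tokens, the uncached total is about $0.099 and the cached total about $0.0513. A single-turn session costs more with caching (1.25x), so the savings come from repeated context.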

Cost Tracking

Every request tracks token usage and cost:

/info     # Session overview
/report   # Detailed per-request breakdown

Set spending limits:

max_session_spending_threshold = 5.0   # USD per session
max_request_spending_threshold = 1.0   # USD per request
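
The difference between the two limits is scope: one applies to a single request, the other to the running total of the session. The sketch below illustrates that distinction only. The function exceeded_limits is hypothetical, it assumes a limit is exceeded when the cost is strictly greater than the threshold, and it says nothing about how Octomind reacts once a limit is reached.

```python
def exceeded_limits(
    request_cost: float,
    session_cost: float,
    max_request: float = 1.0,
    max_session: float = 5.0,
) -> list[str]:
    """Return the names of the thresholds that the given costs exceed.

    Assumption: strict 'greater than' comparison; the real check may differ.
    session_cost is the cumulative cost including the current request.
    """
    exceeded = []
    if request_cost > max_request:
        exceeded.append("max_request_spending_threshold")
    if session_cost > max_session:
        exceeded.append("max_session_spending_threshold")
    return exceeded


if __name__ == "__main__":
    print(exceeded_limits(request_cost=0.3, session_cost=5.2))
```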

Switching Models

Change model mid-session:

/model openai:gpt-4o
/model anthropic:claude-sonnet-4

Or override at startup:

octomind run -m anthropic:claude-sonnet-4

Troubleshooting

"Invalid model format": Must be provider:model. Example: openrouter:anthropic/claude-sonnet-4.

"API key not found": Set the provider's environment variable. Use octomind config --show to check.

"Provider does not support structured output": Not all providers support structured output. Use OpenAI or Anthropic for structured output.

Google Vertex AI issues: Ensure GOOGLE_APPLICATION_CREDENTIALS points to a valid JSON file and the Vertex AI API is enabled.