Octolib
Multi-Provider • Embeddings • Tool Calling • Cost Tracking
One type-safe Rust API for 13+ AI providers. Stop rewriting integration code when switching models. Unified interface for completions, embeddings, reranking, tool calling, vision, and cost tracking — across OpenAI, Anthropic, Google, and beyond.
Key Features
13+ Providers, One Interface
OpenAI, Anthropic, Google, OpenRouter, DeepSeek, Moonshot, MiniMax, Z.ai, Bedrock, Cloudflare, Cerebras, Ollama, Together — and CLI proxies for Codex, Claude, Gemini, Cursor. 37% of enterprises use 5+ models. Stop maintaining separate integrations for each.
Complete Feature Coverage
Structured JSON output with schema validation, vision support for images and video, cross-provider tool calling with automatic parameter extraction, thinking/reasoning for o-series and MiniMax models, OAuth authentication, and automatic cost tracking per request.
Embeddings & Reranking
8 embedding providers (Jina, Voyage, Google, OpenAI, Together, OctoHub, FastEmbed, HuggingFace) and 6 reranking providers (Voyage, Cohere, Jina, Mixedbread, FastEmbed, HuggingFace). Batch processing, token limits, and input type support. Local or API-based.
Why Octolib?
Provider lock-in makes switching models painful
Only 11% of enterprise teams switched LLM vendors last year — not because they didn't want to, but because migration is painful. Octolib makes it a one-line change.
Each provider has different API formats and quirks
A unified, type-safe interface handles format differences, errors, retries, and authentication. Write once, deploy to any provider.
No visibility into AI spending across providers
Automatic cost tracking with token breakdown: input, output, cache read/write, reasoning tokens. Know exactly what each request costs across every provider.
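To make the token breakdown concrete, here is a minimal sketch of how a per-request cost could be derived from those five categories. It is not Octolib's API: the `TokenUsage` and `Pricing` types, the `request_cost` function, and the idea that the categories are disjoint and priced per million tokens are all assumptions made for illustration. Real providers differ in how they count and bill cached and reasoning tokens.

```rust
/// Token counts for one request (hypothetical type, not Octolib's).
/// Assumes the five categories do not overlap.
#[derive(Debug, Clone, Copy, Default)]
pub struct TokenUsage {
    pub input: u64,
    pub output: u64,
    pub cache_read: u64,
    pub cache_write: u64,
    pub reasoning: u64,
}

/// Illustrative prices in USD per one million tokens (assumed unit).
#[derive(Debug, Clone, Copy)]
pub struct Pricing {
    pub input: f64,
    pub output: f64,
    pub cache_read: f64,
    pub cache_write: f64,
    pub reasoning: f64,
}

/// Sums each token category multiplied by its price, then scales
/// from "per million tokens" down to a single request cost in USD.
pub fn request_cost(usage: TokenUsage, price: Pricing) -> f64 {
    const PER_MILLION: f64 = 1_000_000.0;
    let weighted = usage.input as f64 * price.input
        + usage.output as f64 * price.output
        + usage.cache_read as f64 * price.cache_read
        + usage.cache_write as f64 * price.cache_write
        + usage.reasoning as f64 * price.reasoning;
    weighted / PER_MILLION
}
```

With one million input tokens at 2.00 USD and half a million output tokens at 10.00 USD per million, this sketch yields 7.00 USD for the request.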
When to Use Octolib
Multi-Model Applications
Use GPT-4o for complex reasoning, Claude for long context, and Gemini for vision — all through the same API. Switch models by changing a string, not rewriting integration code.
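The idea of switching models by changing a string can be sketched with a small trait and a resolver. Everything here is hypothetical and stands in for the real library: the `ChatProvider` trait, the stub backends, the `resolve` and `ask` functions, and the `provider:model` identifier format are assumptions for illustration, and the stubs return canned text instead of calling any API.

```rust
/// A minimal provider abstraction (hypothetical, not Octolib's trait).
pub trait ChatProvider {
    fn complete(&self, model: &str, prompt: &str) -> String;
}

// Stub backends that echo their input instead of calling a real API.
struct OpenAiStub;
struct AnthropicStub;

impl ChatProvider for OpenAiStub {
    fn complete(&self, model: &str, prompt: &str) -> String {
        format!("[openai/{model}] {prompt}")
    }
}

impl ChatProvider for AnthropicStub {
    fn complete(&self, model: &str, prompt: &str) -> String {
        format!("[anthropic/{model}] {prompt}")
    }
}

/// Splits an assumed "provider:model" identifier and picks a backend.
/// Returns None for unknown providers or a missing model name.
pub fn resolve(spec: &str) -> Option<(Box<dyn ChatProvider>, &str)> {
    let (provider, model) = spec.split_once(':')?;
    if model.is_empty() {
        return None;
    }
    let backend: Box<dyn ChatProvider> = match provider {
        "openai" => Box::new(OpenAiStub),
        "anthropic" => Box::new(AnthropicStub),
        _ => return None,
    };
    Some((backend, model))
}

/// Calling code depends only on the identifier string, so swapping
/// models means changing `spec` and nothing else.
pub fn ask(spec: &str, prompt: &str) -> Option<String> {
    let (backend, model) = resolve(spec)?;
    Some(backend.complete(model, prompt))
}
```

In this sketch, `ask("openai:gpt-4o", "hi")` and `ask("anthropic:claude", "hi")` go through the same call path; only the string differs.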
Cost-Optimized Routing
Automatic token usage and cost calculation per request. Route cheap queries to affordable models, complex ones to capable models. Track spend across all providers in one place.
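A rough sketch of the routing and spend-tracking idea follows. The `SpendTracker` type, the `pick_model` function, and the word-count heuristic are illustrative assumptions, not part of Octolib; a real router would likely use better signals than prompt length.

```rust
use std::collections::HashMap;

/// Accumulates cost per provider (hypothetical helper type).
#[derive(Default)]
pub struct SpendTracker {
    by_provider: HashMap<String, f64>,
}

impl SpendTracker {
    /// Adds the cost of one request to the provider's running total.
    pub fn record(&mut self, provider: &str, cost_usd: f64) {
        *self.by_provider.entry(provider.to_string()).or_insert(0.0) += cost_usd;
    }

    /// Spend for a single provider, or 0.0 if nothing was recorded.
    pub fn for_provider(&self, provider: &str) -> f64 {
        self.by_provider.get(provider).copied().unwrap_or(0.0)
    }

    /// Spend across all providers.
    pub fn total(&self) -> f64 {
        self.by_provider.values().sum()
    }
}

/// Naive routing heuristic: prompts longer than `word_threshold`
/// words go to the capable model, everything else to the cheap one.
pub fn pick_model<'a>(
    prompt: &str,
    cheap: &'a str,
    capable: &'a str,
    word_threshold: usize,
) -> &'a str {
    if prompt.split_whitespace().count() > word_threshold {
        capable
    } else {
        cheap
    }
}
```

The per-request cost passed to `record` would come from whatever usage data the provider response carries.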
Embedding Pipelines
Build RAG systems with any embedding provider. Batch processing with configurable sizes and token limits. Swap between local FastEmbed and API-based Voyage without changing application code.
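Batching with a size cap and a token cap can be sketched as below. This is an illustration under stated assumptions, not Octolib's implementation: `batch_inputs` and `estimate_tokens` are hypothetical names, and the whitespace word count is only a crude stand-in for a real tokenizer.

```rust
/// Crude token estimate: one token per whitespace-separated word,
/// with a minimum of one. A real pipeline would use the provider's tokenizer.
fn estimate_tokens(text: &str) -> usize {
    text.split_whitespace().count().max(1)
}

/// Groups inputs into batches that hold at most `max_items` texts and
/// at most `max_tokens` estimated tokens. A single text that exceeds
/// `max_tokens` on its own is still emitted, alone in its batch.
pub fn batch_inputs<'a>(
    texts: &[&'a str],
    max_items: usize,
    max_tokens: usize,
) -> Vec<Vec<&'a str>> {
    let mut batches: Vec<Vec<&'a str>> = Vec::new();
    let mut current: Vec<&'a str> = Vec::new();
    let mut current_tokens = 0usize;

    for &text in texts {
        let tokens = estimate_tokens(text);
        let would_overflow =
            current.len() >= max_items || current_tokens + tokens > max_tokens;
        // Close the current batch before adding a text that would not fit.
        if would_overflow && !current.is_empty() {
            batches.push(std::mem::take(&mut current));
            current_tokens = 0;
        }
        current.push(text);
        current_tokens += tokens;
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}
```

Each resulting batch would then be sent as one embedding request, whether the backend is local or API-based.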
Tool-Calling Agents
Cross-provider standardized tool calling with automatic JSON Schema parameter extraction. Build agents once, run them on any provider that supports function calling.
Install
cargo add octolib
Built in the Open
Octolib is open source under the Apache 2.0 license. Contributions, issues, and stars are welcome.