Octomind 0.26.0: Four New Providers, One Clean Architecture

Every AI tool hits the same wall eventually. You start with one provider, one model, one way of working. Then you need to compare outputs across frontier models. Then you need a cheaper fallback for the boring stuff. Then you realize your config has grown into a mess of inline definitions, duplicate tool restrictions, and execution paths that diverged months ago.

We hit that wall too. So 0.26.0 is about expansion and consolidation at the same time.

Four new providers landed this week. Two frontier models. A config architecture that replaces divergent execution paths with a single protocol. Skills that don't ghost you after compression. And your terminal finally handles images the way you'd expect — paste and go, no commands to remember.

Here's what changed.

Four New Providers, Two Frontier Models

Octomind's provider catalog grew fast — Featherless, NVIDIA NIM, Groq, and BytePlus all came online riding three octolib bumps in seven days.

The same bumps brought first-class support for GPT-5.5 and DeepSeek V4. If you've been waiting to try the latest frontier models in a session-based workflow, the path is clear now.

Existing [providers.*] configs need no changes — these are purely additive. Point your role at the new provider name and you're done.

Architecture: Layers Became Roles

This is the biggest structural change in 0.26, and it changes how you write configs.

Before, layers had their own model, prompt, and tool restrictions inside [layers.*]. Each layer wrapped its own AI call. It worked, but it meant two execution paths — interactive sessions and layered pipelines — with different capabilities, different bugs, different behavior.

Now:

AI model config lives in a dedicated [roles.*] section with global output defaults at the top of the file.
Layers run as ACP commands — they delegate to a role via a command and optional workdir.
Per-layer tool restrictions and layer_config are gone. Which tools a layer may use is now decided by the role config of its ACP session.
LayerResult is stripped to essentials (outputs, total_time_ms). GenericLayer is replaced by LayerProcessor. The deprecated types and generic.rs modules are gone.
A new request_timeout_seconds field caps how long a layer's HTTP request can run.
The acp save and acp cache commands were folded into a unified skill command.

Why this matters. A layer is now a thin wrapper that runs your real role under ACP. Everything that works in interactive sessions — MCP servers, skills, tool routing, model fallback — automatically works in layered pipelines too. One execution path. One set of behaviors. One place for bugs to hide, which means fewer of them.

The Breaking Change

The command field is now mandatory for [layers.*]. Old layers that relied on inline model and prompt need to be rewritten as ACP commands targeting a role.

Before:

[[layers]]
name = "review"
model = "openai:gpt-4.1"
prompt = "Review this code for bugs"

After:

[[layers]]
name = "review"
command = "acp run reviewer"

[[roles.reviewer]]
model = "openai:gpt-4.1"
prompt = "Review this code for bugs"

The validation error is loud on purpose. Silently coercing an old config into the new shape would produce confusing failures later. Fail fast, fail clearly.
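
A minimal sketch of what a fail-fast check like this could look like. The LayerConfig struct, the validate_layer function, and the error wording are all hypothetical and only illustrate the idea; they are not Octomind's actual types or messages.

```rust
/// Hypothetical, simplified view of one parsed layer entry.
/// Field names mirror the config above; the struct itself is an assumption.
#[derive(Debug, Default)]
pub struct LayerConfig {
    pub name: String,
    pub command: Option<String>,
    // Legacy inline fields, kept only so the error can mention them.
    pub model: Option<String>,
    pub prompt: Option<String>,
}

/// Reject a layer that has no non-empty `command`, with an actionable message.
pub fn validate_layer(layer: &LayerConfig) -> Result<(), String> {
    let has_command = layer
        .command
        .as_deref()
        .map(|c| !c.trim().is_empty())
        .unwrap_or(false);
    if has_command {
        return Ok(());
    }
    let mut msg = format!("layer '{}': missing required field `command`", layer.name);
    // Give old-style configs a hint about the migration path.
    if layer.model.is_some() || layer.prompt.is_some() {
        msg.push_str("; move inline `model`/`prompt` into a role and point `command` at it");
    }
    Err(msg)
}
```

The point of returning an error at load time is that the old "before" config fails with a message naming the layer, instead of being reshaped silently.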

Skills That Don't Ghost You

In 0.25 we shipped declarative skill activation. In 0.26 we made it robust.

The problem was subtle. When a long session compressed itself, your active skills got dropped from the summarized history. You'd hit /done, the conversation would summarize, and suddenly your auto-activated context was gone. The AI would keep talking, but without the skill's knowledge. You might not even notice — you'd just get worse answers.

The fix is more careful than "preserve everything." Here's what actually changed:

A new CompressionTrigger enum gives the pipeline granular control over what survives each kind of compression: auto, manual, and /done. Environment-loaded skills are tracked separately from session-active ones, with session-keyed storage in the OCTOMIND_SKILLS registry. After a summary, auto-activated skills are re-triggered rather than just re-injected, so they go through the same activation rules they'd face in a fresh session.

Active skill context is preserved during pruning, but skill tags are skipped in summaries. No point wasting tokens on definitions you already have loaded. And skills are deduplicated — only the latest definition survives, so old versions can't pollute your task history.

Practically? Your OCTOMIND_SKILLS=foo,bar env var now does what you'd expect across a multi-hour session. The skills stay. The knowledge stays. The ghosting stops.
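
The "only the latest definition survives" rule can be sketched roughly as follows. SkillDef and dedup_latest are hypothetical names, and the real history model is certainly richer than a flat list; this only shows the keep-the-newest idea.

```rust
use std::collections::HashSet;

/// Hypothetical skill entry as it might appear in message history.
#[derive(Debug, Clone, PartialEq)]
pub struct SkillDef {
    pub name: String,
    pub body: String,
}

/// Keep only the most recent definition of each skill, preserving the
/// relative order of the survivors.
pub fn dedup_latest(history: &[SkillDef]) -> Vec<SkillDef> {
    let mut seen = HashSet::new();
    let mut kept: Vec<SkillDef> = history
        .iter()
        .rev() // walk newest-first so the first hit per name is the latest
        .filter(|s| seen.insert(s.name.clone()))
        .cloned()
        .collect();
    kept.reverse(); // restore chronological order
    kept
}
```

Given a history of foo (v1), bar (v1), foo (v2), this keeps bar (v1) and foo (v2), in that order.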

Paste an Image, It Just Works

Images in Octomind aren't new. What's new is how you get them in.

Before, you typed a command such as /image, pointed it at a path, and confirmed. Now? Hit Ctrl+V. If your clipboard holds an image (or a path to one, or a URL), Octomind grabs it, attaches it as a multimodal blob, and renders an inline preview right above your prompt line. Kitty graphics protocol, iTerm2 inline images, whatever your terminal speaks.

No command to remember. No interruption to your flow.

The details that make it actually work:

Silent rendering. Terminal graphics protocols emit acknowledgement bytes. Handle them naively and they leak into your input buffer, corrupting what you type. We strip them before they touch readline.

Downscaled previews. Raw screenshots are enormous. We resize to 320px max and re-encode before sending — base64 payloads small enough that your prompt doesn't stutter or get clipped.

Non-disruptive feedback. Attachment confirmations route through ExternalPrinter so they don't fight the active prompt redraw. You see it, but it doesn't break your flow.

Graceful fallback. No image on the clipboard? Ctrl+V falls through to normal text paste. Nothing changes.

The help menu lists the shortcut. There's a tip the first time you trigger it.
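
For the downscaled previews, the size calculation might look something like this sketch. It assumes "320px max" means the longer side is capped at 320 pixels with aspect ratio preserved; fit_within is a hypothetical helper, and the actual resize and re-encode steps are left out.

```rust
/// Scale (width, height) so the longer side is at most `max_side`,
/// preserving aspect ratio. Images already within the limit are unchanged.
pub fn fit_within(width: u32, height: u32, max_side: u32) -> (u32, u32) {
    let longest = width.max(height);
    if longest <= max_side {
        return (width, height);
    }
    // Integer math with round-half-up; u64 avoids overflow on large inputs.
    let scale = |v: u32| -> u32 {
        let scaled = (v as u64 * max_side as u64 + longest as u64 / 2) / longest as u64;
        (scaled as u32).max(1) // never collapse a side to zero
    };
    (scale(width), scale(height))
}
```

Under that assumption, a 2560x1440 screenshot becomes a 320x180 preview, while a 100x50 icon is left alone.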

Skill System: Cleaner Lifecycle, Less Noise

A handful of smaller skill changes that compound into a noticeably better experience:

Lifecycle events over WebSocket. Skill activate / use / forget events now emit to connected WebSocket clients, with plain-text Using skill: lines suppressed in JSONL mode. If you're driving Octomind from another tool, you finally get clean structured signals instead of parsing stderr.

No duplicate skill messages. An active-registry guard prevents the same skill from being re-injected when it's already loaded.

Validation failures route through the inbox for auto-continuation, with a new SkillValidator source for failure tracking. They don't just disappear anymore.

Forgetting a skill is instant. The SkillForget compression trigger and the global state that tracked it are gone, so forgetting a skill no longer forces a compression pass. Cleanup is deferred to the next natural compression cycle.

MCP Server Management Got Unified

disable_server, persist_server, and a new get_tools_for_server helper now work identically for dynamically registered servers and config-loaded servers. Previously these ran through two code paths with subtle differences. Disabling a server now also temporarily strips its tools, so the LLM can't see stale entries while the server is paused.
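
As a rough illustration of hiding a paused server's tools, here is a sketch that filters the tool list by a set of disabled server names. Tool and visible_tools are hypothetical; this is not the real get_tools_for_server helper or Octomind's data model.

```rust
use std::collections::HashSet;

/// Hypothetical tool record: which server provides it, and its name.
pub struct Tool {
    pub server: String,
    pub name: String,
}

/// Return only the tools whose server is not currently disabled.
/// Nothing is deleted, so re-enabling a server makes its tools visible again.
pub fn visible_tools<'a>(tools: &'a [Tool], disabled: &HashSet<String>) -> Vec<&'a Tool> {
    tools
        .iter()
        .filter(|t| !disabled.contains(&t.server))
        .collect()
}
```

Filtering at read time, rather than removing entries, is one simple way to make the strip temporary.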

`mcp-*.toml` Override Files

You can now drop files such as mcp-foo.toml and mcp-bar.toml alongside your config. They are merged after the base config, and for matching server names the override fields win. Keep local settings in mcp-local.toml and production settings in mcp-prod.toml without rewriting your full config.
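
The merge rule can be sketched like this, assuming each server is a flat map of string fields (the real config has typed fields and more structure). Servers and merge_servers are hypothetical names used only for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical shape: server name -> (field name -> value).
pub type Servers = BTreeMap<String, BTreeMap<String, String>>;

/// Merge `overrides` into `base`. For a server present in both, fields from
/// the override win and untouched base fields are kept; new servers are added.
pub fn merge_servers(mut base: Servers, overrides: Servers) -> Servers {
    for (name, fields) in overrides {
        // `extend` replaces values for keys that already exist.
        base.entry(name).or_default().extend(fields);
    }
    base
}
```

In this sketch an override that only sets url for a server leaves that server's other base fields, such as a timeout, untouched.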

Auto-bind servers are correctly tracked in server_refs to match enabled_servers. They don't leak across role boundaries. Wildcard patterns are added to allowed_tools only in restricted mode. And new tests cover auto-bind injection into role configs, so this won't regress quietly.

Observability and Polish

`/info` Got Smarter

Two additions to the session info command:

Throughput. Tokens-per-second is calculated from API response time and shown as generation speed.

Token averages. Average input and output tokens per API request, average tokens for compressions, average tokens for tool calls, all rendered as a formatted "averages" section.

If you've been eyeballing whether a session is bloating, you now have numbers.
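
The arithmetic behind those numbers is simple. The sketch below assumes generation speed means output tokens divided by total API response time; RequestStats and both functions are hypothetical, and compression and tool-call averages are omitted.

```rust
/// Hypothetical per-request record; field names are assumptions.
pub struct RequestStats {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub response_ms: u64,
}

/// Output tokens per second across all requests, or None if no time elapsed.
pub fn tokens_per_second(stats: &[RequestStats]) -> Option<f64> {
    let total_ms: u64 = stats.iter().map(|s| s.response_ms).sum();
    if total_ms == 0 {
        return None;
    }
    let total_out: u64 = stats.iter().map(|s| s.output_tokens).sum();
    Some(total_out as f64 / (total_ms as f64 / 1000.0))
}

/// Average (input, output) tokens per request, or None for an empty session.
pub fn average_tokens(stats: &[RequestStats]) -> Option<(f64, f64)> {
    if stats.is_empty() {
        return None;
    }
    let n = stats.len() as f64;
    let input: u64 = stats.iter().map(|s| s.input_tokens).sum();
    let output: u64 = stats.iter().map(|s| s.output_tokens).sum();
    Some((input as f64 / n, output as f64 / n))
}
```

For two requests producing 200 and 400 output tokens in 2 and 4 seconds, this gives 600 tokens over 6 seconds, or 100 tokens per second.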

Terminal Titles That Make Sense

Octomind now sets process titles for acp and server subcommands, and updates terminal titles dynamically with the session ID. On a workstation with a dozen tabs running different sessions, the title bar finally tells you which one is which. Cross-platform escape-sequence logic handles the differences between terminals.

Shorter Session Names

Session naming changed from a verbose timestamp to YYMMDD-basename-HHMM-uuid4:

Seconds dropped from the timestamp
UUID suffix shortened from 8 to 4 characters
Components reordered to date → basename → time → uuid
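
To make the new shape concrete, here is a small sketch that assembles a name in that order. session_name is a hypothetical helper, the example values are made up, and taking the first four characters of the id is an assumption about how the suffix is shortened.

```rust
/// Build a name in the YYMMDD-basename-HHMM-uuid4 shape described above.
/// `uuid` is any id string; only its first 4 non-dash characters are used.
pub fn session_name(
    year: u32,
    month: u32,
    day: u32,
    hour: u32,
    minute: u32,
    basename: &str,
    uuid: &str,
) -> String {
    let short: String = uuid.chars().filter(|c| *c != '-').take(4).collect();
    format!(
        "{:02}{:02}{:02}-{}-{:02}{:02}-{}",
        year % 100, // two-digit year
        month,
        day,
        basename,
        hour,
        minute, // seconds are intentionally not part of the name
        short
    )
}
```

With these made-up inputs, session_name(2026, 1, 15, 9, 5, "myproj", "a1b2c3d4-e5f6") yields 260115-myproj-0905-a1b2.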

Also in this batch: unit tests for config validation and session-name generation, validation logic for layer names / descriptions / commands, and a sweep of copyright headers updated to 2026.

Bug Fixes Worth Calling Out

A few of these fix bugs you may have hit and shrugged off as "the LLM being weird."

Mid-loop tool compression no longer orphans tool calls. Compression was being triggered mid-loop while tool results were still streaming, leaving orphaned tool calls in the message history that the API would reject. The fix defers compression until all tool results in a multi-tool response cycle are processed. If you've ever seen "tool_use without matching tool_result" errors mid-session, this was why.

MCP function parameters always have a type field. LLM providers require function parameter schemas to declare "type": "object". The schemars crate omits this field for tagged enums — it's implicit in the spec but not in the JSON output. We now ensure it's present during conversion. This was a silent compatibility bug with several providers.

Dynamic MCP server enable/disable is honored per session. Previously, toggling a dynamic MCP server on or off worked in some session contexts but not others. Now it's consistent everywhere.

ACP layer commands fail loudly when missing. As mentioned above, missing command fields on [layers.*] now produce an actionable error message during validation, not a confusing runtime failure later.
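
The orphaned tool call fix boils down to one invariant: never compress while a tool call is still waiting for its result. A minimal sketch of such a guard, using a hypothetical Message enum and safe_to_compress function rather than Octomind's real message types:

```rust
use std::collections::HashSet;

/// Hypothetical, minimal message model for illustration.
pub enum Message {
    Text(String),
    ToolCall { id: String },
    ToolResult { call_id: String },
}

/// True when every tool call in `history` has a matching result, i.e. it is
/// safe to compress without leaving an orphaned call behind.
pub fn safe_to_compress(history: &[Message]) -> bool {
    let mut pending: HashSet<&str> = HashSet::new();
    for msg in history {
        match msg {
            Message::ToolCall { id } => {
                pending.insert(id.as_str());
            }
            Message::ToolResult { call_id } => {
                pending.remove(call_id.as_str());
            }
            Message::Text(_) => {}
        }
    }
    pending.is_empty()
}
```

In a multi-tool response with calls a and b, the guard stays false after only a's result arrives and flips to true once b's result is in, which is the point at which compression can run.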

Upgrade Notes

Coming from 0.25? One breaking change:

[layers.*] configs must include a command field. Old layers with inline model and prompt need to become ACP commands targeting a role. The validation error guides you through it.

Coming from 0.24 or earlier? Also note these changes from the 0.24 → 0.25 window:

developer:rust is now developer:general — update any references in your config or scripts.
/save is gone. Sessions persist automatically; use /skill for explicit skill management.
Skill activation is declarative now. See the [skills.*] section in your config and the migration guide if you have custom skills.

A couple of subtle behavior changes that aren't strictly breaking but are worth knowing:

Re-test your dynamic MCP server toggles. If you had a workaround for disable/enable not sticking per session, that workaround is no longer needed and may be doing the wrong thing.
Your session list looks different. Names are shorter, UUIDs are 4 chars instead of 8. If you were grepping session names by a specific format, update your patterns.

What's Next

The layers-to-roles migration gives us a single execution path to build on. Here's what we're focusing on next:

Token efficiency. Smarter compression that preserves more signal with fewer tokens. Better pruning of redundant context. We're measuring everything now — the /info averages give us the data to optimize against.

Long-running session stability. The CompressionTrigger system is the foundation, but we want sessions that run for days without degradation. Better summarization strategies, more aggressive deduplication, and smarter decisions about what stays in context.

Tap ecosystem expansion. The community tap has grown well beyond the early developer:general and devops:kubernetes days. We now have security:owasp, content:article, content:blog, content:editor, content:seo, doctor:general, finance:analyst, lawyer:us, lawyer:uk, lawyer:ca, and the full launch suite — bootstrap, brand, explore, pitch, validate. Capabilities have kept pace: filesystem, codesearch, memory, websearch, versioning, plus maps, calendar, market-data, payments, edge-hosting, scraping, messaging-discord, messaging-linkedin, messaging-whatsapp, ecommerce, docker, svelte. The model is working — declare what you need, bin/load resolves providers, and you run. What's missing is the long tail: data:* agents, cloud-specific ops:aws and ops:gcp, more specialized developer variants. The format is built for this. We just need more manifests.

Evaluation and benchmarking. We're done guessing. Which skills get activated but never used? Which providers fail most often? Which compression strategies lose the most useful context? We need systematic benchmarks — not gut feeling — to find the SOTA approaches that actually hold up in long sessions. Real tests, real metrics, real decisions.

And yes, more bug fixes. The orphaned tool calls and MCP type field bugs were silent killers. We're hunting the next class of those.

But right now: four new providers, one clean architecture, skills that stick around, and a terminal that handles images without making you think. That's 0.26.0.

Download 0.26.0 from GitHub or run cargo install octomind --force.

Full changelog: github.com/muvon/octomind/blob/master/CHANGELOG.md

Documentation: octomind.run/docs

Community tap: github.com/muvon/octomind-tap

Octomind is a session-based AI development assistant in Rust. Read more about how skills work in our deep dive on agent skills and auto-activation.