Mixing LLM Providers Inside a Neuron AI Agent

When I started work on v3 of Neuron AI, the first big decision I had to make was not about agents or tools, but about messages. Each LLM provider has its own way of describing a conversation: OpenAI uses one shape, Anthropic another, and Gemini and Ollama add their own variations on top. I could have written thin wrappers and let each provider speak its native dialect, pushing the complexity back to the application developer. Instead, I spent a lot of time on what I now call the Unified Messaging Layer: a single representation of messages, content blocks, and tools that every provider knows how to translate into its own format.
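To make the idea of a unified representation concrete, here is a minimal sketch written in Python purely as a language-neutral illustration. It is not Neuron's code: the `Message` class and both translator functions are hypothetical names, and the two payload shapes are simplified versions of the OpenAI-style and Anthropic-style request formats.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Provider-neutral message (hypothetical, not Neuron's actual class)."""
    role: str  # "system", "user", or "assistant"
    text: str


def to_openai_style(messages: list[Message]) -> dict:
    # Chat Completions-style payload: the system prompt is just
    # one more entry in the message list.
    return {"messages": [{"role": m.role, "content": m.text} for m in messages]}


def to_anthropic_style(messages: list[Message]) -> dict:
    # Messages API-style payload: the system prompt is a top-level field,
    # and each message carries a list of typed content blocks.
    system = "\n".join(m.text for m in messages if m.role == "system")
    turns = [
        {"role": m.role, "content": [{"type": "text", "text": m.text}]}
        for m in messages
        if m.role != "system"
    ]
    payload = {"messages": turns}
    if system:
        payload["system"] = system
    return payload


# The application builds the conversation once, in the neutral form.
conversation = [
    Message("system", "You are a helpful assistant."),
    Message("user", "Hello!"),
]
```

The application code only ever touches `conversation`. Each provider owns its translation, so the same list of messages can be handed to any of them.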

That work felt almost invisible from the outside. People want to see agents, RAG, and workflows, the visible parts of a framework. A messaging layer is plumbing, and plumbing is boring until the day it lets you do something you didn’t plan for. Last week, while sketching out a few changes requested by developers running production agents, I realized that this old design choice had quietly enabled a feature I hadn’t explicitly designed: routing each inference call to one of several providers, transparently to the agent itself.

That’s what the new neuron-core/router package is. It exposes a RouterProvider that implements AIProviderInterface, the same contract every Neuron provider implements. From the agent’s perspective, it is just another provider. Under the hood, every call to chat(), stream(), or structured() is delegated to one of several registered providers, chosen by a routing rule you control.
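The delegation pattern described above can be sketched in a few lines. This is an illustration in Python, not the package's actual implementation: the `AIProvider` protocol is a simplified stand-in for `AIProviderInterface`, and `EchoProvider`, the constructor arguments, and the `by_length` rule are all assumptions made up for the example.

```python
from typing import Callable, Iterator, Protocol


class AIProvider(Protocol):
    """Simplified stand-in for the AIProviderInterface contract."""
    def chat(self, messages: list[str]) -> str: ...
    def stream(self, messages: list[str]) -> Iterator[str]: ...
    def structured(self, messages: list[str], schema: type) -> dict: ...


class EchoProvider:
    """Fake provider (hypothetical) that tags its output with its own name."""
    def __init__(self, name: str) -> None:
        self.name = name

    def chat(self, messages):
        return f"[{self.name}] {messages[-1]}"

    def stream(self, messages):
        yield from self.chat(messages).split()

    def structured(self, messages, schema):
        return {"provider": self.name, "schema": schema.__name__}


class RouterProvider:
    """Exposes the same three methods and delegates each call to one provider."""
    def __init__(self, providers: dict[str, AIProvider],
                 rule: Callable[[list[str]], str]) -> None:
        self.providers = providers
        self.rule = rule  # returns the key of the provider to use

    def _pick(self, messages):
        return self.providers[self.rule(messages)]

    def chat(self, messages):
        return self._pick(messages).chat(messages)

    def stream(self, messages):
        return self._pick(messages).stream(messages)

    def structured(self, messages, schema):
        return self._pick(messages).structured(messages, schema)


# Hypothetical routing rule: long prompts go to the "large" provider.
def by_length(messages):
    return "large" if len(messages[-1]) > 40 else "small"


router = RouterProvider(
    {"small": EchoProvider("small"), "large": EchoProvider("large")},
    by_length,
)
```

Because `router` has the same methods as any single provider, the calling code does not need to know that a routing decision happens on every call.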