The earlier posts in this series were about what the gateway lets you call (cache-aware spawning across five providers, the Codex review gate, the CLI-versus-API argument), and the one before this was about the parts that do not show up as a tool: the upstream tracking and the website that became the project's front door. This one is about a different kind of front door. The gateway can now listen over HTTP, behind real authentication, and serve more than one caller without those callers being able to read each other's work. That sounds like a small toggle. It is not. Moving an MCP server off a local pipe and onto a network port changes the trust boundary completely, and 2.9.0 is the release where I sat down and remediated all seventeen findings from a multi-LLM red-team of exactly that surface before telling anyone the remote path was ready.
Short version: llm-cli-gateway is a single Model Context Protocol server that wraps five vendor CLIs (Claude Code 2.1.177, Codex 0.139.0, Google's Antigravity agy 1.0.8, Grok 0.2.51, and Mistral Vibe 2.14.1) behind one uniform tool surface. One orchestrating agent can fan a task out to several models, collect independent opinions, run a red-team or a consensus check, and keep durable session and job state across all of it. Until recently that only made sense on localhost over stdio. As of 2.9.0 the same server runs over HTTP with either a static bearer token or a built-in OAuth 2.0 authorisation server (PKCE on by default, an opt-in human-consent gate, and a trusted-principal-header seam for when you front it with your own identity-aware proxy). Every session, job, and stored request is stamped with an owner principal, and access is enforced per principal. Remote provider calls are refused unless a workspace is registered, and the whole thing fails closed rather than open when the configuration is dangerous.
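To make "stamped with an owner principal" and "fails closed" concrete, here is a minimal Python sketch of the two ideas. It is an illustration only, not the gateway's code: `HttpConfig`, `validate`, `JobStore`, and every field name are hypothetical, and the real server's options and storage will look different.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class ConfigError(Exception):
    """Raised when the server should refuse to start."""


class AccessDenied(Exception):
    """Raised when a principal asks for a record it does not own."""


@dataclass(frozen=True)
class HttpConfig:
    # Hypothetical settings, not the gateway's real option names.
    transport: str = "stdio"  # "stdio" or "http"
    bearer_token: Optional[str] = None
    oauth_enabled: bool = False


def validate(config: HttpConfig) -> HttpConfig:
    """Fail closed: an HTTP listener without authentication never starts."""
    has_auth = bool(config.bearer_token) or config.oauth_enabled
    if config.transport == "http" and not has_auth:
        raise ConfigError("HTTP transport requires a bearer token or OAuth")
    return config


@dataclass
class JobStore:
    """Stores job results, each stamped with the principal that created it."""

    _owners: Dict[str, str] = field(default_factory=dict)
    _results: Dict[str, str] = field(default_factory=dict)

    def put(self, principal: str, job_id: str, result: str) -> None:
        if not principal:
            raise AccessDenied("anonymous callers cannot own jobs")
        self._owners[job_id] = principal
        self._results[job_id] = result

    def get(self, principal: str, job_id: str) -> str:
        # Unknown jobs and jobs owned by someone else are rejected the
        # same way, so a caller cannot probe for other principals' ids.
        if not principal or self._owners.get(job_id) != principal:
            raise AccessDenied("no such job for this principal")
        return self._results[job_id]
```

In this sketch, a config with `transport="http"` and no credentials raises `ConfigError` from `validate` instead of starting an open listener, and `store.get("bob", "job-1")` raises `AccessDenied` when `job-1` was created by `alice`. The ownership check sits in the store rather than in each tool handler, so a new code path cannot skip it by accident.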







