TL;DR
Blackbox AI (a VS Code extension with millions of installs) advertises free access to premium LLMs like Minimax M2 and Kimi K2.6, but silently routes all free-tier requests to a single Azure OpenAI deployment serving gpt-5.4-nano. The UI presents 25+ model choices; the proxy allowlist admits exactly 3 model strings, all resolving to the same backend. The response headers prove this: x-litellm-model-id, x-litellm-model-api-base, and llm_provider-azureml-model-session are identical across all model selections. The backend runs LiteLLM v1.80.11 on Google Cloud Run, proxying to Azure OpenAI in Sweden Central. The extension also bundles a hidden Electron voice chat app with hardcoded Xirsys TURN credentials and no anti-tamper protection. Full reproduction commands are at the bottom, so you can verify every claim with curl.
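The header comparison behind that claim can be sketched offline. This is a minimal illustration, not the tooling used for the investigation: it assumes you have already captured response headers for each model selection (for example with curl) and stored them as dictionaries. The three header names come from the findings above; the function names, model labels, and header values in the demo are hypothetical placeholders.

```python
# Header names are the ones cited in the TL;DR. Everything else here
# (function names, sample values) is an illustrative assumption.
BACKEND_HEADERS = (
    "x-litellm-model-id",
    "x-litellm-model-api-base",
    "llm_provider-azureml-model-session",
)


def backend_fingerprint(headers):
    """Return the backend-identifying header values as a tuple.

    Lookup is case-insensitive because HTTP header names are.
    Missing headers show up as None.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return tuple(lowered.get(name) for name in BACKEND_HEADERS)


def distinct_backends(responses):
    """Group requested model names by the fingerprint their response carried."""
    groups = {}
    for model, headers in responses.items():
        groups.setdefault(backend_fingerprint(headers), []).append(model)
    return groups


if __name__ == "__main__":
    # Placeholder values standing in for real captured headers.
    sample = {
        "x-litellm-model-id": "PLACEHOLDER-ID",
        "x-litellm-model-api-base": "https://example.invalid",
        "llm_provider-azureml-model-session": "PLACEHOLDER-SESSION",
    }
    captured = {"model-a": dict(sample), "model-b": dict(sample)}
    for fingerprint, models in distinct_backends(captured).items():
        print(models, "->", fingerprint)
```

If every selectable model lands in one group, the selections share a backend as far as these headers can tell; more than one group would mean the gateway really does route to different deployments.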
Introduction
AI coding assistants are everywhere. One particularly popular extension caught my eye: Blackbox AI. It boasts millions of installs, a UI with 25+ premium models (GPT-5, Claude Sonnet 4, Grok, Gemini, etc.), and a free tier that specifically touts Minimax M2 and Kimi K2.6 as the incentive.
Inference compute is not cheap. A free extension that routes thousands of developers to some of the most expensive LLMs available raises architectural questions. Is this a loss-leader strategy? Is the extension running quantized local models? Or does a multi-provider gateway sit between the UI and the actual inference?






