How I Reverse Engineered a Popular AI Extension

TL;DR

Blackbox AI (a VS Code extension with millions of installs) claims free access to premium LLMs like Minimax M2 and Kimi K2.6, but silently routes all free-tier requests to a single Azure OpenAI deployment serving gpt-5.4-nano. The UI presents 25+ model choices; the proxy allowlist admits exactly 3 model strings, all resolving to the same backend. Response headers prove this: identical x-litellm-model-id, x-litellm-model-api-base, and llm_provider-azureml-model-session across all model selections. The backend runs LiteLLM v1.80.11 on Google Cloud Run proxying to Azure OpenAI in Sweden Central. The extension bundles a hidden Electron voice chat app with hardcoded Xirsys TURN credentials and zero anti-tamper protection. Full reproduction commands at the bottom. Verify every claim with curl.
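The header comparison can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used for the investigation: the three header names are the ones listed above, while the function names, model labels, and sample values are hypothetical placeholders. It assumes you have already captured the response headers for each model selection (for example, from curl -i output) as plain dictionaries.

```python
# Header names come from the TL;DR above; everything else here is illustrative.
FINGERPRINT_HEADERS = (
    "x-litellm-model-id",
    "x-litellm-model-api-base",
    "llm_provider-azureml-model-session",
)


def backend_fingerprint(headers):
    """Return the tuple of fingerprint header values (None where a header is absent)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return tuple(lowered.get(name) for name in FINGERPRINT_HEADERS)


def group_by_backend(headers_by_model):
    """Group requested model names by the backend fingerprint their response carried."""
    groups = {}
    for model, headers in headers_by_model.items():
        groups.setdefault(backend_fingerprint(headers), []).append(model)
    return groups


if __name__ == "__main__":
    # Placeholder values, not real captured data.
    captured = {
        "model-a": {
            "x-litellm-model-id": "placeholder-id",
            "x-litellm-model-api-base": "https://example.invalid",
            "llm_provider-azureml-model-session": "placeholder-session",
        },
        "model-b": {
            "X-LiteLLM-Model-Id": "placeholder-id",
            "X-LiteLLM-Model-Api-Base": "https://example.invalid",
            "llm_provider-azureml-model-session": "placeholder-session",
        },
    }
    for fingerprint, models in group_by_backend(captured).items():
        print(models, "->", fingerprint)
```

If every model selection lands in a single group, the responses all carried the same backend identifiers. Note that responses missing all three headers also share one group (all values None), so an all-None fingerprint says nothing about the backend.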

Introduction

AI coding assistants are everywhere. One particularly popular extension caught my eye: Blackbox AI. It boasts millions of installs, a UI with 25+ premium models (GPT-5, Claude Sonnet 4, Grok, Gemini, etc.), and a free tier that specifically touts Minimax M2 and Kimi K2.6 as the incentive.

In the world of AI, compute is not cheap. A free-to-use extension routing thousands of developers to the most expensive LLMs on the planet raises architectural questions. Is this a loss-leader strategy? Are they using quantized local models? Or does a multi-provider gateway sit between the UI and the actual inference?
