Taxonomy Surgery, Cosine = 1.0000, and Making Routing Disappear into Infrastructure


Friday, June 5, 2026


This is part 3 of the Adaptive Model Routing series. Part 1 built an LLM categorizer with Groq — 8 categories, 3 tiers. Part 2 added k-NN embedding lookup in shadow mode, discovered 83% tier accuracy, and found 61% cost savings on paper. This post covers what happened next.

When Phase 2 ended, I had a working embedding pool running in shadow mode inside crab-bot. Category accuracy sat at 78.6%. Not bad, but the aggregate number hid something worth looking at.

Phase 3: When Validation Tells You a Category Doesn't Need to Exist

The leave-one-out accuracy by category told the real story:
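For readers who want to produce this kind of breakdown on their own pool, here is a minimal sketch of leave-one-out k-NN accuracy per category. It is not the crab-bot code: the function names, the toy 2-D "embeddings", the category labels `code` and `chat`, and the choice of `k=3` with majority voting are all assumptions made for illustration. The idea is to hold out each labeled example in turn, classify it from its nearest neighbours in the rest of the pool by cosine similarity, and tally hits per true category.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def loo_accuracy_by_category(pool, k=3):
    """pool: list of (embedding, category). Returns {category: accuracy}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for i, (query, true_cat) in enumerate(pool):
        # Hold out example i and rank every other example by similarity.
        ranked = sorted(
            ((cosine(query, emb), cat)
             for j, (emb, cat) in enumerate(pool) if j != i),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Majority vote among the k nearest neighbours.
        votes = Counter(cat for _, cat in ranked[:k])
        predicted = votes.most_common(1)[0][0]
        totals[true_cat] += 1
        hits[true_cat] += int(predicted == true_cat)
    return {cat: hits[cat] / totals[cat] for cat in totals}


# Toy pool (hypothetical data): two clusters, plus one "chat" example
# that sits inside the "code" cluster.
pool = [
    ([1.0, 0.0], "code"),
    ([0.9, 0.1], "code"),
    ([0.8, 0.2], "code"),
    ([0.0, 1.0], "chat"),
    ([0.1, 0.9], "chat"),
    ([0.2, 0.8], "chat"),
    ([0.85, 0.15], "chat"),
]

result = loo_accuracy_by_category(pool, k=3)
print(result)  # {'code': 1.0, 'chat': 0.75}
```

In this toy pool the overall accuracy looks healthy, but the per-category view shows that every miss comes from one label whose examples overlap another cluster. That is the kind of signal an aggregate figure hides.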
