Echo: results so far

Thursday 11 June 2026

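TL;DR (AI)

Echo routes without a classifier: it calls a cheap model twice with two different personas, keeps the cheap answer when the two agree, and otherwise escalates to Sonnet. On HumanEval hard it reports 94% oracle and 29% cost savings. It needs no training, so it can be deployed on a workload immediately, which removes the adoption bottleneck of traditional trained routers.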


Routing LLM requests cheaply without training a router — and the measurement bug that nearly fooled us.

By Nick Meinhold, Robin Langer, Meghana Ganapa, and Adarsha Aryal · 10 June 2026

TL;DR

The idea: instead of training a classifier to route easy tasks to a cheap model and hard ones to an expensive model, call the cheap model twice with two different personas. If the answers agree, keep the cheap one; if they disagree, escalate. No classifier, no labels.
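The routing rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the persona texts, the `cheap` and `strong` callables, and the agreement test (equality after whitespace and case normalization) are all hypothetical, since the post does not specify them here.

```python
from typing import Callable

# Hypothetical persona prompts; the post does not list the actual ones.
PERSONAS = (
    "You are a careful senior engineer.",
    "You are a pragmatic competitive programmer.",
)


def normalize(answer: str) -> str:
    """Collapse whitespace and case so formatting noise is not read as disagreement."""
    return " ".join(answer.lower().split())


def echo_route(
    prompt: str,
    cheap: Callable[[str, str], str],   # assumed signature: (persona, prompt) -> answer
    strong: Callable[[str], str],       # assumed signature: (prompt) -> answer
    personas: tuple[str, str] = PERSONAS,
) -> tuple[str, str]:
    """Return (answer, route), where route is 'cheap' or 'escalated'."""
    # Call the cheap model twice, once per persona.
    first = cheap(personas[0], prompt)
    second = cheap(personas[1], prompt)

    # Agreement: keep the cheap answer. Assumed check: normalized string equality.
    if normalize(first) == normalize(second):
        return first, "cheap"

    # Disagreement: escalate to the expensive model.
    return strong(prompt), "escalated"
```

Note that both cheap calls are always paid for, and an escalation adds the expensive call on top, so any saving depends on how often the two personas agree. For code tasks, exact string equality is probably too strict an agreement test; comparing behaviour on test inputs would be one alternative, but that is also an assumption here.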
