I don't trust the LLM to classify my email. So I don't let it.

Thursday, 25 June 2026

My classifier calls an LLM on every single email. The LLM is not allowed to classify the email.

That sounds like a contradiction. It's the most important design decision in the thing.

A reader named @nazar_boyko left a comment on my last post — the one where a cheap model beat GPT-4o on email triage — and put it better than I did:

Once the LLM is a feature scorer and not the decider, "consistency over genius" falls right out of it, and a cheap fast model is exactly what you want for reading the same four signals the same way every time.

The price upset was the fun headline. This is the actual thesis. So here it is on its own.
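Here is a minimal sketch of what "feature scorer, not decider" can look like in code. It is not the classifier from this post: the four signal names, the 0.5 thresholds, the output labels, and the `score_with_llm` callable are all hypothetical, picked only to make the split visible. The model fills in numbers; plain code turns the numbers into a decision.

```python
from typing import Callable, Dict, Mapping

# Hypothetical signal names; the post does not list the real four.
SIGNALS = ("from_known_contact", "asks_for_reply", "has_deadline", "is_bulk_mail")


def parse_scores(raw: Mapping[str, object]) -> Dict[str, float]:
    """Keep only the expected signals, each clamped to [0, 1].

    Anything else the model returns (a label, an explanation) is dropped,
    so the model has no channel through which to decide.
    """
    scores: Dict[str, float] = {}
    for name in SIGNALS:
        value = raw.get(name)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"missing or non-numeric score for {name!r}")
        scores[name] = min(1.0, max(0.0, float(value)))
    return scores


def decide(scores: Mapping[str, float]) -> str:
    """Deterministic rules: the same scores always give the same label.

    Thresholds and labels are illustrative assumptions.
    """
    if scores["is_bulk_mail"] >= 0.5:
        return "archive"
    if scores["has_deadline"] >= 0.5 and scores["asks_for_reply"] >= 0.5:
        return "urgent"
    if scores["from_known_contact"] >= 0.5 or scores["asks_for_reply"] >= 0.5:
        return "reply_later"
    return "archive"


def classify(email_text: str,
             score_with_llm: Callable[[str], Mapping[str, object]]) -> str:
    """Call the model on every email, but let only `decide` pick the label."""
    return decide(parse_scores(score_with_llm(email_text)))


def fake_scorer(email_text: str) -> Mapping[str, object]:
    """Stand-in for a real LLM call, so the sketch runs offline."""
    return {
        "from_known_contact": 0.9,
        "asks_for_reply": 0.8,
        "has_deadline": 0.7,
        "is_bulk_mail": 0.1,
        "label": "archive",  # the model's own opinion, ignored on purpose
    }


print(classify("Can you send the report by Friday?", fake_scorer))
```

The fake scorer volunteers its own `label`, and `parse_scores` throws it away. That is the whole point of the design: a model that drifts or gets creative can only move a score, and the rules that turn scores into actions stay readable and testable.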

Related reading

Confidence is enough to decide. It's not enough to do.

One Ruler to Measure Them All: How Language Affects LLM Quality

How LLM Tokens Work (And Why They Explain Your AI Bill)

Two months building an investment bot. What it taught me about LLMs

Why I Don’t Let the LLM Decide Issue State

The Auditor's AI Workflow: How I Use LLMs Without Trusting Them