Large Context Windows Are Not a Solved Problem

A year ago, models could hold a million tokens in context, roughly 750,000 words or ten novels. The capability is impressive but often misunderstood, and many teams struggle to use large context windows effectively in production.

Friday, 19 June 2026

Originally published on lavkesh.com

I recall the excitement a year ago when models could hold a million tokens in context. That's about 750,000 words or ten average novels sitting in a single prompt. The demos were impressive, and researchers posted benchmarks, but soon teams realized that having a massive context window and knowing what to do with it are two different problems.

I'm not dismissing the capability; a million tokens in context is a real technical achievement. However, I think there's a version of the conversation happening right now that treats window size as the finish line, and that's worth pushing back on.

The pattern I've seen play out is that a team gets access to a long-context model, loads in a large document or codebase, sends a query, and gets back results that are okay, sometimes good, but frustratingly hard to diagnose when they fall short. The model technically saw everything in the prompt, but whether it used the right parts is a different question entirely.

Researchers have identified a phenomenon called "lost in the middle," where models tend to pay disproportionate attention to content at the beginning and end of a context window and underweight material in the middle. So if you're feeding in a 200-page document and the critical detail is on page 94, the model may overlook it even though it was technically in the prompt.
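One way to check whether position is hurting a particular setup is to plant the same fact at different depths of an otherwise identical context and compare the answers. The sketch below is a minimal illustration of that idea, not a method from the research mentioned above. It only builds the prompts; sending them to a model and scoring the replies is left out because that depends on the API in use. The function names, the filler pages, and the "heron-42" fact are all hypothetical.

```python
def insert_at_depth(pages, needle, depth):
    """Return a copy of pages with needle inserted at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    index = round(depth * len(pages))
    return pages[:index] + [needle] + pages[index:]


def build_probes(pages, needle, question, depths=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Build one prompt per depth; the prompts differ only in where the needle sits."""
    probes = []
    for depth in depths:
        context = "\n\n".join(insert_at_depth(pages, needle, depth))
        probes.append({"depth": depth, "prompt": f"{context}\n\nQuestion: {question}"})
    return probes


if __name__ == "__main__":
    # Hypothetical stand-in for a 200-page document.
    pages = [f"Page {n}: filler text about unrelated topics." for n in range(1, 201)]
    needle = "The maintenance password is 'heron-42'."
    probes = build_probes(pages, needle, "What is the maintenance password?")
    for probe in probes:
        # Each prompt would be sent to the model under test; here we only inspect it.
        print(probe["depth"], len(probe["prompt"]))
```

If answers are reliable at depths 0.0 and 1.0 but degrade around 0.5, that points to position rather than content as the problem, which is at least something concrete to diagnose.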


