Meet Harness-1: A 20B Retrieval Subagent Trained With Reinforcement Learning Inside a Stateful Search Harness on gpt-oss-20b

Most search agents are trained as policies over a growing transcript. The model decides how to search. It must also remember what it saw, which evidence matters, and which claims it checked. A team of researchers from University of Illinois Urbana-Champaign, UC Berkeley, and Chroma argues this asks too much. Reinforcement learning ends up optimizing both search decisions and routine bookkeeping at once.

Their answer is Harness-1, a 20B retrieval subagent built on gpt-oss-20b. It was trained with reinforcement learning inside a stateful search harness. The harness holds the bookkeeping. The policy keeps the semantic decisions. The weights and harness code are publicly released.

https://arxiv.org/pdf/2606.02373

What is Harness-1 Actually

Harness-1 produces a ranked set of documents for a downstream answering model. It does not answer questions itself. It runs inside a state-machine harness centered on a per-episode WORKINGMEMORY.

https://arxiv.org/pdf/2606.02373

What is Harness-1 Actually

Harness-1 produces a ranked set of documents for a downstream answering model. It does not answer questions itself. It runs inside a state-machine harness centered on a per-episode WORKINGMEMORY.

Meet Harness-1: A 20B Retrieval Subagent Trained With Reinforcement Learning Inside a Stateful Search Harness on gpt-oss-20b

Meet Harness-1: A 20B Retrieval Subagent Trained With Reinforcement Learning Inside a Stateful Search Harness on gpt-oss-20b

Other newsrooms on this story

Related reading

Researchers trained an open source AI search agent, Harness-1, that outperforms…

Continually improving our agent harness · Cursor

Add a Specialized Deep Research Skill to Agent Harnesses | NVIDIA Technical Blog

DeepSeek-V4: a million-token context that agents can actually use

Towards self-driving codebases · Cursor

How Far Can a Small Coding Model Go With a Better Harness?

Other newsrooms on this story

Related reading

Researchers trained an open source AI search agent, Harness-1, that outperforms…

Continually improving our agent harness · Cursor

Add a Specialized Deep Research Skill to Agent Harnesses | NVIDIA Technical Blog

DeepSeek-V4: a million-token context that agents can actually use

Towards self-driving codebases · Cursor

How Far Can a Small Coding Model Go With a Better Harness?