Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4's 33.5%

Most web agents today drive a browser one action at a time. The model receives the current page state — as a screenshot or DOM text — and predicts the next click, keypress, or scroll. This action-at-a-time design made sense when language models had limited reasoning ability. As models have become more capable at writing and debugging code, that rigid loop has become a constraint rather than a structure that helps.

Microsoft Research’s AI Frontiers lab built a different approach. Their new open-source framework, Webwright, gives the agent a terminal instead of a stateful browser session. The agent writes Playwright code to control browsers, runs bash commands, inspects logs, and iteratively refines scripts. Playwright is an open-source browser automation library, also from Microsoft, that supports programmatic control of Chromium, Firefox, and WebKit browsers.

Webwright separates the agent from the browser and treats the browser as something the agent can launch, inspect, and discard while developing a program. The persistent artifact is not the browser session but the code and logs in the local workspace.

This is the same model a developer uses when writing an RPA (Robotic Process Automation) script. Instead of manually clicking through a site each time, they write a script once. That script can be rerun, adapted, and shared. Webwright applies this to LLM-powered agents.

Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4's 33.5%

Microsoft Research Releases Webwright: A Terminal-Native Web Agent Framework That Scores 60.1% on Odysseys, Up from Base GPT-5.4's 33.5%

Other newsrooms on this story

Related reading

Microsoft's Fara1.5 AI outperforms OpenAI and Google in web tasks

Microsoft Just Embarrassed Browser Web Agents — 1,000 Lines Made GPT-5.4 Beat…

I made my portfolio readable by AI agents. A scanner found four things that…

Microsoft Releases Fara1.5: A Family of Browser Computer-Use Agents (4B/9B/27B)…

New benchmark shows Claude Mythos and GPT-5.5 can develop real browser exploits…

Meet WebBrain: An Open-Source, Local-First AI Browser Agent That Reads Pages…

Other newsrooms on this story

Related reading

Microsoft's Fara1.5 AI outperforms OpenAI and Google in web tasks

Microsoft Just Embarrassed Browser Web Agents — 1,000 Lines Made GPT-5.4 Beat…

I made my portfolio readable by AI agents. A scanner found four things that…

Microsoft Releases Fara1.5: A Family of Browser Computer-Use Agents (4B/9B/27B)…

New benchmark shows Claude Mythos and GPT-5.5 can develop real browser exploits…

Meet WebBrain: An Open-Source, Local-First AI Browser Agent That Reads Pages…