Stop scraping the page when the data is already in the network tab

Browser automation is often the wrong layer for structured web data. Inspect background requests first, then decide whether you need a browser at all.

Thursday, June 18, 2026


You write a scraper with Playwright, wait for the page to load, close the cookie banner, click a filter, and parse a table out of the DOM. Then someone redesigns the page and your selector breaks. The annoying part is that the data probably never lived in the HTML in the first place.

Most modern websites render a UI around structured background requests. The browser loads the shell, runs JavaScript, and calls internal endpoints for prices, availability, inventory, search results, profile data, or whatever the page needs. If you scrape the rendered page, you often process hundreds of kilobytes of layout and tracking code to recover a few kilobytes of JSON.
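To make that concrete, here is a minimal sketch of reading such a background response directly instead of parsing the rendered page. The endpoint URL, the `items` key, and the `name` and `price` fields are all hypothetical; a real site will have its own URL and payload shape.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical endpoint, of the kind you would find among a page's background requests.
API_URL = "https://example.com/api/products?category=shoes"


def parse_products(body: str) -> list[dict]:
    """Pull the fields we care about out of a JSON response body.

    Assumes a payload shaped like {"items": [{"name": ..., "price": ...}]}.
    That shape is an illustration, not any real site's schema.
    """
    payload = json.loads(body)
    return [
        {"name": item["name"], "price": item["price"]}
        for item in payload.get("items", [])
    ]


def fetch_products(url: str = API_URL) -> list[dict]:
    """Request the JSON endpoint directly, with no browser involved."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=10) as response:
        return parse_products(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration with an inline sample instead of a live request.
    sample = '{"items": [{"name": "Trail shoe", "price": 89.5, "sku": "A1"}]}'
    print(parse_products(sample))
```

Keeping the parsing in its own function means it can be exercised against a saved response body, without touching the network.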

Look at the network layer before writing browser code

Before reaching for Playwright or Puppeteer, open DevTools and check what the site actually does.
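Part of this check can be scripted once a browsing session has been captured. The sketch below assumes the DevTools network log was exported as a HAR file (the name `session.har` is a placeholder) and lists the responses whose MIME type mentions JSON, largest first. It is one possible triage aid, not a replacement for reading the requests yourself.

```python
import json
import sys


def json_requests(har_text: str) -> list[tuple[str, str, int]]:
    """Return (method, url, body size) for every JSON response in a HAR log."""
    har = json.loads(har_text)
    found = []
    for entry in har["log"]["entries"]:
        content = entry["response"]["content"]
        mime = content.get("mimeType", "")
        if "json" in mime.lower():
            request = entry["request"]
            found.append((request["method"], request["url"], content.get("size", 0)))
    # Largest bodies first: a heuristic for spotting the data-bearing calls.
    return sorted(found, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Placeholder path; pass your own export as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "session.har"
    with open(path, encoding="utf-8") as handle:
        for method, url, size in json_requests(handle.read()):
            print(f"{size:>10}  {method:<6} {url}")
```

Sorting by size is only a guess at relevance; a small response can still be the one that carries the data you need.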

In Chrome:

Related reading

A 10-Line Playwright Trick That Saved Me Hours on Every Sephora Run

When Traditional Web Scraping Fails: A Practical AI Approach

Your AI agent isn't scraping; it's just failing to read.

I built a Claude browser agent that automates Playwright tasks — here's the…

Agentic Web Browsing Workflows with Python and Playwright

How to Scrape E-Commerce Sites for AI Agents Using Playwright and LLMs
