Everyone reaches for page.locator(".some-class") first. They shouldn't.

getByRole is the most stable locator in Playwright, and almost nobody uses it for scraping. They think it's a testing-library thing. It's not. It's a way of asking the page "what is this element, semantically?" instead of "what class name does the design system happen to use this week?"
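To make the contrast concrete, here is a rough sketch of the two approaches side by side. It is not the actor's actual code. The helper names (readByClass, readByRole), the PageLike and LocatorLike types, and the choice of the "heading" role are all assumptions made up for illustration. The small interfaces only exist so the sketch runs without a browser; a real Playwright Page should satisfy them structurally.

```typescript
// Hypothetical minimal shapes so this sketch type-checks without a browser.
// Playwright's Page and Locator are expected to fit these structurally.
interface LocatorLike {
  first(): LocatorLike;
  textContent(): Promise<string | null>;
}

interface PageLike {
  locator(selector: string): LocatorLike;
  getByRole(role: string, options?: { name?: string | RegExp }): LocatorLike;
}

// Brittle: tied to a class name the design system can rename at any time.
// ".some-class" is the placeholder from the text, not a real selector.
async function readByClass(page: PageLike): Promise<string | null> {
  return page.locator(".some-class").first().textContent();
}

// Sturdier: asks what the element is (a heading), not how it is styled.
// The optional accessible name narrows the match when a page has many headings.
async function readByRole(
  page: PageLike,
  name?: string | RegExp,
): Promise<string | null> {
  const options = name === undefined ? undefined : { name };
  return page.getByRole("heading", options).first().textContent();
}
```

With real Playwright you would pass in the page you get from browser.newPage() after page.goto(...). If a redesign renames the class, readByClass quietly stops matching; readByRole keeps working as long as the element is still exposed as a heading.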

That distinction is what kept our Facebook video transcript actor running through three Facebook redesigns this past year.

The 3-item checklist

When does getByRole work? When the site is built by people who care about accessibility. That's more sites than you think, especially big ones with legal requirements (US government, EU compliance, large e-commerce).