TL;DR

Over 844,000 sites had implemented llms.txt by late 2025 — still under 1% of active websites, meaning early adoption is a differentiator; Cloudflare, Anthropic, Stripe, and Vercel all ship one.

HTML is a lossy agent interface: every step from fetch to context extraction degrades content; llms.txt gives agents a curated structured map, llms-full.txt gives them the entire corpus in one HTTP GET.

Converting HTML to Markdown cuts token usage 68% for clean content and up to 87% for real-world pages — the same content costs dramatically less to process when the delivery format is right.

The ## Permissions block is the clearest machine-readable signal a site owner can give about AI inference-time use; robots.txt was designed for crawling, not for this.