2026-06-15 · 5 min read

Don't block AI crawlers — charge them

Search "how to block AI crawlers" and you'll find a hundred guides that all say the same thing: add GPTBot, ClaudeBot, CCBot, PerplexityBot, and Bytespider to your robots.txt, maybe a few firewall rules, and call it done. That advice isn't wrong. It's just answering the wrong question.

The real question isn't "how do I stop AI from reading my content?" It's "AI is going to read my content either way — do I want to be paid for it?" Blocking and allowing are the two choices everyone talks about. There's a third, and in 2026 it's the one that pays.

The choice is not block-or-allow. It's block, allow, or charge.

Look at what each option actually does:

  • Block. You keep your content, earn nothing, and lose the reach. The crawlers that do send traffic back — the answer and citation bots that surface you inside AI results — go away with the ones that don't.
  • Allow. You keep the reach, earn nothing, and watch training-scale crawlers pull thousands of pages for free.
  • Charge. The crawler gets a price instead of a free copy. It can pay and proceed, or skip the page. You're paid per request, and you stop subsidizing companies with billion-dollar training budgets.

Most "block AI bots" advice never mentions the third option, because until recently it wasn't practical. It is now.

Not all crawlers are the same — and that's the whole point

Lumping every bot into one Disallow line is the mistake. There are two very different populations hitting your site:

  • Training and scraping crawlers: GPTBot, CCBot (Common Crawl), Bytespider, Google-Extended and friends. They consume your pages to build datasets. Published crawl-to-referral analyses put some of these at tens of thousands of fetches per visitor referred, and a few send no traffic back at all. This is pure extraction.
  • Answer and citation bots: the user agents that fetch a page because someone asked a question right now and may cite or link you in the response. These can actually send readers.

Blocking everything kills the second group to stop the first. Charging lets you treat them differently: price the extraction, stay discoverable to the citation traffic, and let each crawler decide whether your content is worth the quote. You're not picking between reach and revenue anymore — you're pricing the difference.
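
As a rough illustration of treating the two populations differently, here is a minimal sketch that sorts a User-Agent string into one of the two groups and picks an action. The token lists are assumptions for illustration only (the citation-bot names in particular are examples, not a maintained list), and user-agent strings can be spoofed, so a real deployment would want stronger verification.

```python
from enum import Enum


class CrawlerKind(Enum):
    TRAINING = "training"
    CITATION = "citation"
    UNKNOWN = "unknown"


# Substrings looked for in the User-Agent header. Illustrative, not exhaustive.
TRAINING_TOKENS = ("GPTBot", "CCBot", "Bytespider")
CITATION_TOKENS = ("ChatGPT-User", "Perplexity-User")  # assumed examples


def classify(user_agent: str) -> CrawlerKind:
    """Classify a request by case-insensitive substring match on its User-Agent."""
    ua = user_agent.lower()
    if any(token.lower() in ua for token in CITATION_TOKENS):
        return CrawlerKind.CITATION
    if any(token.lower() in ua for token in TRAINING_TOKENS):
        return CrawlerKind.TRAINING
    return CrawlerKind.UNKNOWN


def action_for(user_agent: str) -> str:
    """Price the extraction; let citation bots and everyone else through."""
    if classify(user_agent) is CrawlerKind.TRAINING:
        return "charge"
    return "allow"


print(action_for("Mozilla/5.0 (compatible; GPTBot/1.0)"))        # charge
print(action_for("Mozilla/5.0 (compatible; ChatGPT-User/1.0)"))  # allow
```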

Why robots.txt was never going to be enough

robots.txt is a polite request. It's a text file that asks well-behaved bots not to crawl, and it works exactly as well as the bot's operator decides to let it. Some honor it. Some ignore it. Some spin up a new user-agent string the week after you block the old one. You can't enforce a Disallow you can't see being violated.
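
To make "polite request" concrete: the robots.txt check runs inside the crawler, not on your server. The sketch below uses Python's standard urllib.robotparser to show the question a well-behaved crawler asks itself before fetching. The robots.txt content and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that disallows one training crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler runs this check on its own side before fetching.
gptbot_allowed = parser.can_fetch("GPTBot", "https://example.com/article")
other_allowed = parser.can_fetch("SomeOtherBot", "https://example.com/article")

print(gptbot_allowed)  # False
print(other_allowed)   # True
# Nothing on the server stops a crawler that simply never runs the check.
```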

Charging is enforced at a different layer. When the decision happens at the edge — before the request ever reaches your origin — a crawler doesn't get to choose whether to respect a suggestion. It gets an HTTP 402 Payment Required response with a price attached. No payment, no content. That's not a request; it's a turnstile.

How charging a crawler actually works

The mechanism is the open x402 protocol, and the shape is simple:

  1. A crawler requests a page.
  2. The edge checks your rules. Human visitor? Pass through, untouched. Crawler hitting a priced path? Return a signed 402 challenge quoting the price.
  3. The crawler's runtime reads the price, pays in USDC, and retries with proof.
  4. The edge verifies the payment and serves the page. Earnings land in your wallet automatically — no invoices, no accounts to provision for callers you'll never meet.
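
The four steps above can be sketched as a small request handler. This is a simplified model of the decision logic, not the x402 wire format: the header name, the challenge fields, the priced path, the price, and the verify_payment stub are all assumptions for illustration, and the sketch omits the signing of the challenge.

```python
import json
from dataclasses import dataclass, field

PRICED_PREFIX = "/articles/"        # hypothetical priced path
PRICE_USDC = "0.001"                # hypothetical price per request
PAYMENT_HEADER = "X-PAYMENT"        # assumed header name; check the x402 spec
CRAWLER_TOKENS = ("GPTBot", "CCBot")


@dataclass
class Request:
    path: str
    user_agent: str
    headers: dict = field(default_factory=dict)


@dataclass
class Response:
    status: int
    body: str


def is_crawler(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(token.lower() in ua for token in CRAWLER_TOKENS)


def verify_payment(proof: str) -> bool:
    # Stub: a real edge would verify the payment proof with a payment service.
    return proof == "demo-paid"


def handle(req: Request) -> Response:
    # Step 2: humans and unpriced paths pass through untouched.
    if not is_crawler(req.user_agent) or not req.path.startswith(PRICED_PREFIX):
        return Response(200, "page content")
    # Step 4: a retry carrying valid proof of payment gets the page.
    proof = req.headers.get(PAYMENT_HEADER)
    if proof and verify_payment(proof):
        return Response(200, "page content")
    # Step 2, crawler case: quote the price in a 402 challenge.
    challenge = {"price": PRICE_USDC, "currency": "USDC", "resource": req.path}
    return Response(402, json.dumps(challenge))


crawler = Request("/articles/pricing", "Mozilla/5.0 (compatible; GPTBot/1.0)")
first = handle(crawler)                      # step 1: request, gets a 402
crawler.headers[PAYMENT_HEADER] = "demo-paid"
second = handle(crawler)                     # step 3: retry with proof
human = handle(Request("/articles/pricing", "Mozilla/5.0 (Windows NT 10.0) Firefox/126.0"))

print(first.status, second.status, human.status)  # 402 200 200
```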

For a content site, the part that matters most: no code changes. You connect Cloudflare, a Worker enforces the rules at the edge, and your origin doesn't change. You write pricing rules (by path, by crawler, by anything you can match on) and the edge does the rest. The full walkthrough is in "How Paywall charges AI crawlers for site owners."
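
One way to picture pricing rules by path and by crawler is an ordered table where the first match wins. The rule shape, paths, and prices below are hypothetical and only illustrate the idea; they are not the product's actual configuration format.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class PricingRule:
    path_glob: str      # e.g. "/research/*"
    crawler_glob: str   # matched against a crawler token; "*" means any crawler
    price_usd: str      # per-request price, kept as a string to avoid float rounding


# Ordered from most specific to least specific; the first match wins.
RULES = [
    PricingRule("/research/*", "GPTBot", "0.005"),
    PricingRule("/research/*", "*", "0.002"),
    PricingRule("/blog/*", "*", "0.001"),
]


def price_for(path: str, crawler: str) -> Optional[str]:
    """Return the price of the first matching rule, or None if the path is free."""
    for rule in RULES:
        if fnmatchcase(path, rule.path_glob) and fnmatchcase(crawler, rule.crawler_glob):
            return rule.price_usd
    return None


print(price_for("/research/llm-costs", "GPTBot"))  # 0.005
print(price_for("/research/llm-costs", "CCBot"))   # 0.002
print(price_for("/contact", "GPTBot"))             # None
```

Keeping the specific rules first means a general fallback never shadows a per-crawler price.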

"But will they actually pay?"

The honest answer: some will, some won't, and that's fine — that's what a price is for. A crawler that won't pay a tenth of a cent for your page was never going to be a customer; you've simply stopped giving it the page for free. The ones that pay are telling you your content has value to them, in the only language that's unambiguous.

It's also worth being clear-eyed about scale. Per-page prices are small, and for a tiny site the early numbers can look like a tip jar. The case for charging isn't "get rich tomorrow"; it's "stop subsidizing extraction, keep your citation reach, and own a revenue line that grows with the traffic." We dug into the skeptics' version of this argument honestly in "Is x402 worth it yet?"
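
To put rough numbers on the tip-jar point, here is a back-of-the-envelope calculation. The request volumes and the share of crawlers that pay are invented for illustration; only the tenth-of-a-cent price comes from the example above.

```python
from decimal import Decimal


def monthly_revenue(crawler_requests: int, pay_rate: Decimal, price_usd: Decimal) -> Decimal:
    """Revenue = priced requests x share that actually pay x price per request."""
    total = Decimal(crawler_requests) * pay_rate * price_usd
    return total.quantize(Decimal("0.01"))


PRICE = Decimal("0.001")     # a tenth of a cent per request
PAY_RATE = Decimal("0.10")   # hypothetical: 10% of priced requests end in payment

small_site = monthly_revenue(20_000, PAY_RATE, PRICE)
large_site = monthly_revenue(5_000_000, PAY_RATE, PRICE)

print(small_site)  # 2.00
print(large_site)  # 500.00
```

Under these assumptions the same price yields pocket change at 20,000 crawler requests a month and a real revenue line at 5 million, which is the sense in which the number grows with the traffic.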

Where to start

If your first instinct was to block, do this instead:

  1. Decide what's worth charging for. Usually it's your highest-value, most-scraped content — not your contact page.
  2. Set a price per request. Start low; you can iterate. Run it in test mode first so you can watch the whole loop without real money moving.
  3. Keep humans free. The rules match crawlers, not your readers.
  4. Watch the dashboard, then tune the price by path and by crawler.

Blocking is the move that feels safe and earns nothing. Charging is the move that keeps you discoverable and turns the same traffic into a stream of payments.

Next steps