The Atlantic has built a scorecard for AI crawlers, identifying which bots actually send readers back and which just strip content. Only those with value get through.
This approach led it to block a single AI crawler that tried to recrawl its site 564,000 times in the past seven days.
Some publishers have taken a hard line on AI crawlers, blocking every bot they don't have a licensing deal with. The Atlantic has a licensing deal with OpenAI, whose crawler it doesn't block, but it has determined that other AI crawlers must drive traffic back to its site or bring in new subscribers before it will unblock them. The hope is that if the AI engines want access to its content to improve their LLMs' outputs, they'll pay a licensing fee.
“Most of the AI platforms drive almost no traffic, and that’s by design. And this is the fundamental question for media and the AI companies: will the search platforms evolve in such a way that they drive meaningful traffic or meaningful value? Because right now, they don’t,” Nick Thompson, CEO of The Atlantic, told Digiday.
“They’re not highlighting the source material. They’re not really driving people to the sites. And so the amount of traffic you get is de minimis… [so] the number of subscribers will be quite low too. We’re not getting meaningful subscribers from the [AI bots] we’ve blocked,” he said. The Atlantic has over 1 million paid subscribers.
The Atlantic kicked off this AI bot-blocking rating system this summer, when Thompson and chief product officer Gitesh Gohel started tracking how much the site was being scraped by AI crawlers without permission. They used Cloudflare's tool, which had launched three weeks earlier and gave all of its publisher customers the ability to block AI crawlers by default. In a spreadsheet, they charted which crawlers were hitting the site and which ones led to referral traffic and subscription conversions.
“It was hard for us to block [AI bots] and follow them. They use headless scrapers. They use third-party scrapers. They do all of this stuff to make it hard for you to follow that. Cloudflare is in the business of thinking this stuff out,” Thompson said.
The AI bot blocking calculation: looking for traffic or subscriptions
Thompson and Gohel meet every week to discuss how the AI bots are behaving. The Atlantic declined to share how many AI bots it was tracking.
Thompson said they’ll talk through a dashboard that shows how many site visitors came from AI platforms like Anthropic, ChatGPT or DeepSeek — as well as how many subscribers. While that latter number is “very small,” it helps The Atlantic decide which AI bots to block.
For now, the publisher has kept its parameters broad when it comes to how much traffic an AI bot needs to send back. “We don’t have a specific threshold,” Thompson said. “But it’s somewhere between zero and a lot. There are AI companies that drive essentially zero traffic or maybe one subscriber. We’ll definitely block them. If they drove 1,000 subscribers? Well, that’s different. Each subscriber pays $80, that’s $80,000 worth of revenue.”
The Atlantic has seen traffic driven by AI bots from tech companies like Google, Apple, DuckDuckGo, Bing, ChatGPT, Amazon, Perplexity, Facebook, ProRata and Mistral.
“Most of them give some value — some small value — and so you just have to trade off [the cost]. The cost is you’re helping them build a competitive system. You’re helping [AI engines] potentially out-compete you, and you’re losing all leverage to negotiate a deal with them. And you’re losing all leverage to enter litigation with them,” Thompson said.
Some publishers have taken a more heavy-handed approach and blocked most AI bots, though many are now reevaluating that strategy. TollBit CEO Toshit Panigrahi has advised against blanket bot-blocking entirely, saying it incentivizes AI bots to evade detection.
Will Allen, vp of product at Cloudflare, described the AI bot-blocking process for its clients in three steps: audit, define and enforce. The specifics vary by client, depending on a publisher's business priorities and which AI crawlers create enough value to justify access, he said. Publishers can then express those preferences through robots.txt, allowing or disallowing particular bots' access to their sites, he added.
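To make that concrete, here is a minimal robots.txt sketch of the kind of per-bot preferences Allen describes. The crawler tokens shown (OpenAI's GPTBot and Common Crawl's CCBot) are common, publicly documented examples used purely for illustration, not The Atlantic's actual configuration:

    # Illustrative robots.txt excerpt, not any publisher's real file
    # Admit a crawler judged to return enough value (e.g., a licensing partner)
    User-agent: GPTBot
    Allow: /

    # Turn away a crawler that scrapes without sending traffic or subscribers back
    User-agent: CCBot
    Disallow: /

    # Default rules for everything else
    User-agent: *
    Allow: /

Robots.txt is only a statement of preference, not a technical barrier, which is why tools like Cloudflare's sit in front of the site to enforce blocks against bots that ignore it.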
Benjamin Fabre, co-founder and CEO of cybersecurity and bot-blocking company DataDome, said AI traffic across 17,000 websites increased fourfold from Q1 to Q3 2025, with some AI agents, such as Huawei's, generating billions of requests per month without sending any traffic back.
Thompson said several unnamed AI companies contacted him once he publicized The Atlantic's new AI blocking analysis, but nothing came of those discussions.
The Google challenge
One big headache for publishers is that they can't block Google's AI use of their content without worrying about what it will do to their search traffic. The controls are technically separate (Google's search crawler is called Googlebot, while Google-Extended is the robots.txt token governing Google's AI), but if a publisher blocks Google-Extended, its content can still show up in AI Overviews, Google's AI-generated summaries, because those are tied to Google Search. And because AI Overviews ride on the core search crawler, publishers can't keep their content out of them without also giving up search traffic.
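In robots.txt terms, that split looks roughly like the sketch below. It assumes Google's documented setup, in which Google-Extended is a token governing AI use of crawled content rather than a crawler that can be kept out of Search, so disallowing it does not remove pages from AI Overviews:

    # Illustrative robots.txt excerpt: keep Google Search, opt out of Google-Extended
    # Googlebot continues to crawl for Search (and, in practice, for AI Overviews)
    User-agent: Googlebot
    Allow: /

    # Signals that content should not feed Google's AI models
    User-agent: Google-Extended
    Disallow: /

The only lever that would keep content out of AI Overviews, disallowing Googlebot itself, is the same lever that removes a site from search results.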
The Atlantic is planning to add Cloudflare's new Content Signals Policy to its robots.txt file, which gives publishers a way to communicate how they do and don't want AI crawlers like Google's to use their content once it's scraped.
But the policy carries no guarantee of Google's compliance and no enforcement mechanism; it simply states the publisher's terms: scrape our site to index our pages for search, but don't use our content to train your AI systems. That is the instruction The Atlantic plans to direct at Google's crawlers in its robots.txt.
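A hedged sketch of what that instruction could look like follows. Cloudflare's published Content Signals Policy expresses preferences as a Content-Signal line in robots.txt with three signals (search, ai-input and ai-train); the values below illustrate the "index us, don't train on us" stance rather than The Atlantic's actual file:

    # Illustrative robots.txt excerpt using Cloudflare's Content Signals Policy
    # Search indexing is welcome; using content as AI input or for AI training is not
    User-agent: *
    Content-Signal: search=yes, ai-input=no, ai-train=no
    Allow: /

Like the rest of robots.txt, the line is a declaration rather than a technical block, which is part of why Thompson frames it as a way to put the publisher's terms clearly on the record.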
Thompson concedes that Google might not comply. But if it doesn't, that could give publishers like The Atlantic more leverage in potential future lawsuits against AI and tech companies, he added.
“My view is that we should set up our site in a way that explains very clearly how we want to be treated and how we want our content to be traded, and how we want to negotiate,” he said.
Allen told Digiday last month that millions of sites on Cloudflare have already implemented the Content Signals Policy tool. When asked if Google was abiding by the request from publishers, Allen said it was “early stages” and that it wasn’t possible yet to monitor Google’s compliance.
“Until Google really wants to do it, we have no way to really prevent it,” Fabre said.