From walls to frameworks: Publishers and tech giants push weekly talks on AI content use

Amazon, Meta, Microsoft and Google all had seats at the table alongside 35 publishers at the IAB Tech Lab’s LLM working group in NYC last Thursday. The clearest action point: the initiative has shifted to weekly meetings as it races to set standards for how AI uses and pays for content.

More than 70 companies gathered for the workshop, roughly half of them publishers, including a handful from Europe. The rest were a mix of big tech companies representing their respective LLMs, tech vendors and the edge-cloud companies Cloudflare and Fastly, which are taking a far more active role in helping publishers block unauthorized bots, shifting from background tech enablers to vocal gatekeepers in the AI era.

It’s not surprising that Amazon, Google, Microsoft and Meta were present at the workshop — as major advertising players with long-established ties to both the IAB and publishers, they’re already part of the ecosystem. OpenAI and Perplexity were still no-shows, though. 

IAB Tech Lab CEO Anthony Katsur said he was pleased with the publisher turnout but wants an even higher ratio of publisher representation in the group. 

The kick-off meeting for the working group, on July 23, skewed heavily toward ad tech representation, leaving some publishers Digiday spoke to a little wary. “This needs to be a conversation between the publishers and the LLMs, not with ad tech vendors in the middle,” said a publishing exec whose company attends the group and who agreed to speak on background, adding that if ad tech became too dominant, it would be problematic for the publishers in the group.

Katsur agreed that the first meeting was too ad tech-heavy, but said last week’s meeting didn’t come off that way. “I think that we definitely flipped the script a bit, but I’d love to see even more publishers,” said Katsur.

Digital advertising is full of cautionary tales of too many cooks in the standards kitchen. In the mid-2010s, multiple header bidding wrappers competed until Prebid became the de facto standard in 2019. Privacy regulation in the form of GDPR brought the IAB’s Transparency and Consent Framework into play circa 2018-20, which many publishers struggled to implement. And video specs have been messy and fragmented for more than a decade: VAST was introduced in 2008, and VPAID layered on complexity from 2012 before being deprecated in 2019. The lesson: without alignment between publishers, tech vendors and advertisers, new standards stall.

What got discussed 

To block or not to block remained a dominant topic of discussion during the hour-long workshop, with some publishers wanting to take a harder line on “getting the LLMs’ attention enough to cut a deal” by blocking LLM crawlers’ ability to access their content for retrieval-augmented generation (RAG) purposes, according to Katsur.

The other side of the coin: many publishers are also wary of blocking without knowing the full downstream impact of doing so, chiefly the risk of being cut out of the deals conversation entirely.

Katsur said that solving this publisher dilemma isn’t the purpose of the Content Monetization Protocols (CoMP) framework, but a decision for individual publishers. The focus of forthcoming meetings, he stressed, should be on outlining next steps for an API framework that provides a set of interfaces through which LLMs can work with publishers, to establish some form of viable long-term economic model.
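To make that abstract idea concrete, here is a minimal, entirely hypothetical sketch of what such an interface could look like. CoMP has published no API spec, and every name and number below (Purpose, AccessTerms, request_access, the per-item rate) is invented for illustration:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# Purely hypothetical sketch: CoMP has published no API spec yet. It only
# makes concrete what "a set of interfaces by which LLMs can work with
# publishers" could mean: a crawler declares who it is and why it wants the
# content, and the publisher answers with terms rather than a blanket
# allow/block.

class Purpose(Enum):
    TRAINING = "training"  # ingesting content to train a model
    RAG = "rag"            # fetching content at query time (retrieval-augmented generation)

@dataclass
class AccessTerms:
    allowed: bool
    price_per_item: Optional[float]  # e.g., a pay-per-crawl or pay-per-inference rate
    attribution_required: bool

def request_access(bot_id: str, purpose: Purpose) -> AccessTerms:
    """Publisher-side policy decision, hard-coded here for illustration."""
    if purpose is Purpose.TRAINING:
        return AccessTerms(allowed=False, price_per_item=None, attribution_required=True)
    return AccessTerms(allowed=True, price_per_item=0.002, attribution_required=True)

print(request_access("ExampleBot/1.0", Purpose.RAG))
```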

Talking in detail about the economic models now is “cart before the horse,” though, he added. One priority: creating better content classification and structured data taxonomies, which make it easier for AI systems and search engines to understand and correctly cite content, and ensure that when AI crawlers are allowed in, they can scrape publisher sites more efficiently. That kind of thing should be a light lift for publishers, added Katsur, and is where the framework should begin.
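As an illustration of what that machine-readable classification looks like in practice: the working group has not published a taxonomy spec, but schema.org-style structured data, which publishers already embed for search engines, is the closest existing analogue. All field values in this Python sketch are placeholders:

```python
import json

# Illustrative only: the working group has not published a taxonomy spec.
# This is the kind of schema.org-style structured data publishers already
# embed for search engines, and the sort of machine-readable classification
# that helps AI systems identify and correctly cite content.
article_metadata = {
    "@context": "https://schema.org",
    "@type": "NewsArticle",
    "headline": "Example headline",
    "datePublished": "2025-09-25",
    "author": {"@type": "Person", "name": "Jane Reporter"},
    "publisher": {"@type": "Organization", "name": "Example Publisher"},
    "isAccessibleForFree": False,   # flags paywalled content to crawlers
    "articleSection": "Media",      # coarse content classification
}

# Served in the page as <script type="application/ld+json">...</script>
print(json.dumps(article_metadata, indent=2))
```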

Let’s face it: publishers risk tripping over their own feet if they can’t align on a pragmatic framework. Some are blocking AI crawlers, others are cutting deals and a few are suing, and the lack of a unified approach may leave the industry splintered at a moment when coherence could mean real leverage.

Katsur believes that figuring out a mutually beneficial set of APIs, one that benefits publishers first and LLMs second, will also encourage the LLMs to buy in further. “The Wild West of crawling only scales so much,” he added.

Messer Media CEO and working group member Scott Messer underlined the pressing need for publishers to align on their approach to working with AI outfits, as a too-fragmented approach could prove costly.

“Publishers need a common set of language [to communicate with LLMs], understand the problem, understand the vendors, and talk about what solutions work for us, and work for the LLMs,” he added. “Because, if we don’t convince them, then it’s not valid.”

Publishers want standards around AI crawler identification 

This is hardly a U.S.-only debate; it’s a global issue for publishers. Katsur carried the LLM protocols conversation to Germany in September, joining another international working group to push the effort forward. At Dmexco in Cologne, he spoke to around 20 German publishing houses and agencies about the LLM framework at an event organized by the German Federal Association of the Digital Economy (BVDW). “The Dmexco meeting was packed with pretty much every major German publisher. The topics were pretty much the same [as the NY one], but the publishers were really leaning in, and very vocal,” said Katsur.

The IAB will follow with a London round table dinner for U.K. publishers, centered on the LLM framework, at the IAB Tech Lab’s International Summit in early November.

IAB Tech Lab isn’t the only body circling standards for how LLMs use publisher content and compensate publishers for it. The Internet Engineering Task Force (IETF) is working on machine-readable standards for bot identification and permissions, while the World Wide Web Consortium (W3C) is looking at standards for labeling, watermarking and verifying digital content origins, which matters to publishers who want to mark their content for or against AI usage.

Meanwhile, infrastructure company Cloudflare rolled out tooling that lets publishers update their robots.txt files in bulk across domains with explicit rules for AI crawlers. That followed just a week after the announcement of Really Simple Licensing (RSL), an open standard that lets publishers define machine-readable licensing terms for their content, including attribution, pay per crawl and pay per inference. The initiative has the backing of companies including People Inc., Condé Nast, Ziff Davis, Reddit, Fastly and Yahoo.
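For readers unfamiliar with what those explicit rules look like, here is a minimal sketch of a bulk update in Python. This is not Cloudflare’s actual tooling or API; the user-agent names (GPTBot, ClaudeBot, Google-Extended, CCBot) are real, published AI crawler identifiers, while the domains and file layout are hypothetical:

```python
from pathlib import Path

# A minimal sketch of the kind of bulk robots.txt update such tooling
# automates; this is not Cloudflare's actual API. The user-agent strings are
# the published names of real AI crawlers, but the domains and file layout
# here are hypothetical.
AI_CRAWLER_RULES = """
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
"""

def apply_ai_rules(robots_path: Path) -> None:
    """Append the AI-crawler rules to one site's robots.txt, skipping duplicates."""
    existing = robots_path.read_text() if robots_path.exists() else ""
    if "GPTBot" not in existing:
        robots_path.parent.mkdir(parents=True, exist_ok=True)
        robots_path.write_text(existing.rstrip() + "\n" + AI_CRAWLER_RULES)

# The same policy, pushed across every domain a publisher operates.
for domain in ["example-news.com", "example-sports.com"]:
    apply_ai_rules(Path(domain) / "robots.txt")
```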

Katsur said the IAB Tech Lab’s framework is focused more on the full spectrum of access models for LLMs than RSL, which is more of a licensing language, but that there is potential for the two to dovetail. “We’re [about] access; tiering of content; determining what the commercials are: Is it pay per crawl, is it all you can eat? Is it pay per consumer search query result?” he said. For example, archival content might be offered broadly, while newer or premium stories could be licensed on a per-result basis. The key questions are how access is locked, tracked and monetized across that lifecycle.

As for publishers, they don’t really care who wins the LLM standards race. This is a global, market-wide issue that needs a neutral framework where publishers, retailers and big tech all sit at the same table, stressed Stefan Betzold, chief product marketing officer at Bauer Media Group, who attended the Dmexco meeting. “As publishers, we have to support these working groups. There needs to be a standardized protocol for bot and agent identification and management. Whether it comes from the IAB or another neutral body isn’t the point — what matters is that it does not come from a single vendor,” he said. “We need clear, purpose-driven identification of crawlers to manage them securely and responsibly in the future,” he added.

Bertelsmann, the global media, services and education company whose businesses include RTL Group, publisher Penguin Random House and the music company BMG, had execs at the July kick-off workshop in NYC and at the Dmexco meeting. “Standards are the most effective way to scale solutions without reinventing the wheel each time,” said Achim Schlosser, vp of global data standards at Bertelsmann, who attended both. “No one has all the answers. What matters is staying open-minded, engaging where it counts, and continuously adapting your offerings to the new ecosystems that are emerging.”

Ronan Shields contributed reporting.
