WTF is AI ‘grounding’ licensing, and why do publishers say it matters over training deals?

This article is a WTF explainer, in which we break down media and marketing’s most confusing terms.
AI training licensing deals are starting to feel like yesterday’s news as publishers and platforms focus on more dynamic, usage-based models.
Rather than the initial training deals that formed the backbone of AI licensing partnerships between AI platforms and news publishers, recent deals have been forged around different parameters: what many in the industry refer to as “AI grounding.”
In fast-moving digital areas like AI, the terminology tends to splinter quickly. Vendors, publishers, platforms and analysts coin their own terms: for instance, “grounding,” “content inference compute,” and “retrieval augmented generation” (RAG) are all intertwined and refer more or less to the same thing. Those who can’t be bothered with jargon of any sort simply call grounding and RAG “web search.”
To AI engineers, there are subtle differences between them, but for publishers, what matters is that RAG/grounding has changed how they get paid, given how large language models (LLMs) now process information.
One-time lump sum payments are out; recurring, usage-based licensing agreements are in. “As we’ve moved more into RAG deals, the per-usage aspect of these pricing structures has become the preeminent piece of the pie when it comes to fees,” said Aaron G. Rubin, partner in the strategic transactions and licensing group for law firm Gunderson Dettmer.
Here’s a primer.
What is the difference between training versus grounding deals?
In a nutshell, payment terms of grounding or “RAG” deals are based on how AI systems fetch live content from publishers in real time. If a person asks about recent news, such as “Show me an update on the meeting between Trump and Zelensky,” which happened within the last week, AI engines won’t have that information stored in their training data. “Training windows for AI engines tend to be up to six months old; they don’t know anything after the training date,” said Martin Alderson, co-founder of web performance consultancy Catch Metrics. That’s why they use RAG to pull the information from a multitude of publishers to provide the best response to the user.
That model should create opportunities for recurring licensing revenue, attribution and continued visibility. In contrast, training deals are typically one-time payments in which publishers get an upfront lump sum, or a fixed fee spread over several years, for content used to train a model. The New York Times agreed to a training deal with Amazon, to the tune of $20 million, while News Corp struck a similar deal for $50 million. Many of the agreements from the first wave of publisher-AI platform deals would have been for training.
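For readers who want to see the mechanics, here is a minimal sketch of the retrieve-then-answer flow in Python. It is an illustration under stated assumptions, not how any particular AI platform works: the publishers, URLs and article text are hypothetical, the relevance score is a simple word-overlap count, and the final call to a language model is left out.

```python
from dataclasses import dataclass


@dataclass
class Article:
    publisher: str
    url: str
    text: str


def retrieve(query: str, articles: list[Article], k: int = 2) -> list[Article]:
    """Return up to k articles sharing the most words with the query (toy scoring)."""
    query_words = set(query.lower().split())

    def overlap(article: Article) -> int:
        return len(query_words & set(article.text.lower().split()))

    ranked = sorted(articles, key=overlap, reverse=True)
    # Drop articles that share no words with the query at all.
    return [a for a in ranked[:k] if overlap(a) > 0]


def build_grounded_prompt(query: str, sources: list[Article]) -> str:
    """Place the retrieved text and its source links in front of the question."""
    context = "\n".join(
        f"[{i}] {a.publisher} ({a.url}): {a.text}"
        for i, a in enumerate(sources, 1)
    )
    return (
        "Answer using only these sources and cite them:\n"
        f"{context}\n\nQuestion: {query}"
    )


# Hypothetical publishers and URLs, for illustration only.
corpus = [
    Article("Example Daily", "https://example.com/summit",
            "leaders held a meeting on tuesday to discuss a ceasefire"),
    Article("Sample Times", "https://example.org/markets",
            "stock markets closed higher after the rate decision"),
    Article("Demo Post", "https://example.net/talks",
            "the meeting between the two leaders ended without a deal"),
]

query = "update on the meeting between leaders"
sources = retrieve(query, corpus)
prompt = build_grounded_prompt(query, sources)
print(prompt)
```

The point for publishers is the `sources` list: each article that lands in it is a fetch, and potentially a citation, that a usage-based deal could count.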
Why is focus shifting to so-called grounding or RAG deals?
For starters, few publishers would have been able to negotiate to the same level as the NYT and News Corp. The value of training data has also receded for AI platforms. For publishers like DPG Media, training deals don’t warrant decent payouts, stressed Valerie de Naeyer, head of Gen AI transformation and operational excellence at DPG Media. “In terms of copyright law, publishers are not so keen on licensing content to train the model either; lots of questions on IP remain unresolved,” she said. “It’s possible that there is also a training component in some deals, in case of historic or less relevant content, but in case of real-time, content grounding is preferred,” she added.
On July 30, Gannett signed a deal with Perplexity allowing it to license content from USA Today and the USA Today Network. As always, details on payment terms are scarce, but it’s an example of a RAG/grounding deal because Perplexity’s approach centers on ad revenue sharing, not training content deals.
“Gannett has joined Perplexity’s Publisher Program, which incorporates Retrieval Augmented Generation (RAG) as it relates to our trusted content being included as part of answers to Perplexity users question[s] through their consumer offerings,” confirmed a Gannett spokesperson in an email statement.
So if it’s not a flat fee, what is payment based on?
The umbrella term is usage-based payment structures. There are plenty of examples already, and the exact type of payment agreed upon will vary depending on the AI company involved. Some examples are: pay per usage, pay per query, pay per crawl, and models based on ad revenue sharing, like those Perplexity and Prorata.ai provide, which remunerate publishers when their content is used within RAG. The IAB Tech Lab is working with publishers and cloud edge companies to develop both pay-per-crawl and pay-per-query models for its standardized framework.
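To make the difference between these structures concrete, here is a rough Python sketch of how the same month of activity could translate into fees under three of the models named above. Every number in it is a made-up assumption for illustration: the usage figures, the per-crawl and per-query rates, and the revenue share are not drawn from any real deal, and actual contracts may define and meter "usage" differently.

```python
from dataclasses import dataclass


@dataclass
class UsageReport:
    crawls: int          # times the AI system fetched the publisher's pages
    cited_queries: int   # user queries whose answer cited the publisher
    ad_revenue: float    # ad revenue earned on answers citing the publisher


def pay_per_crawl(report: UsageReport, rate_per_crawl: float) -> float:
    """Fee grows with every fetch, whether or not the content is shown."""
    return report.crawls * rate_per_crawl


def pay_per_query(report: UsageReport, rate_per_query: float) -> float:
    """Fee grows with every answer that actually uses the content."""
    return report.cited_queries * rate_per_query


def ad_revenue_share(report: UsageReport, share: float) -> float:
    """Fee is a fraction of the ad revenue tied to those answers."""
    return report.ad_revenue * share


# Hypothetical month of activity and hypothetical rates.
month = UsageReport(crawls=10_000, cited_queries=2_000, ad_revenue=500.0)

fees = {
    "pay_per_crawl": pay_per_crawl(month, rate_per_crawl=0.001),
    "pay_per_query": pay_per_query(month, rate_per_query=0.01),
    "ad_revenue_share": ad_revenue_share(month, share=0.25),
}

for model, fee in fees.items():
    print(f"{model}: ${fee:.2f}")
```

Unlike a flat training fee, each of these totals is recomputed every period, which is why the choice of metered unit (a crawl, a query or an ad dollar) becomes a central negotiating point.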
From a licensing standpoint, the key question is whether content is actually surfaced in the output — cited, attributed, and linked back. That’s what defines a RAG-style deal, stressed Rubin. In contrast, traditional training deals involve feeding content into a model so it can learn from it at scale, but without necessarily reflecting that specific content in the output, he added.
“I think a lot of these licensing deals have moved to…the grounding side of things, where if I want to cite and use News Corp articles in my output and link to them, I need to license that from them if I’m a tech company,” he said. “And so I think that’s another reason why we’re seeing these grounding deals become more prominent in the recent past, and going forward.”
Is there a preferred type of usage deal yet?
Too early to say. Deals will depend on the negotiating strength of each party, stressed Gary Kibel, partner at law firm Davis+Gilbert. “Both sides are learning and becoming more sophisticated in these deals,” he said. “Maybe publishers are starting to realize what additional controls they should push for in the agreements, and the AI platforms are starting to learn about maybe additional permitted uses they want to get into the agreement,” he added.
A 2025 AI licensing deal already looks different from a 2024 one, thanks to lessons learned — and by 2026, deals will likely evolve again as new applications for content emerge, said Kibel.
“There is no one-size-fits-all with finance,” he added.
But this evolution in the payment terms seems ultimately better for publishers, right?
Right. When the earliest version of ChatGPT first burst on the scene in November 2022, the picture looked very different. Publishers had a common fear: that the LLMs had stripped all their content. The models were built. It was game over. So it was, in a sense, a period of damage control on their part. “People negotiated deals and made some money, but none of the deals seemed particularly great, and they were all one-offs,” said Paul Bannister, chief strategy officer at Raptive. “So it’s like, say you got a check for $20 million, that’s great, but it’s not going to save your business five years from now.”
For now, it’s all about usage. Publishers are reporting a surge in crawls, with the same piece of content sometimes scraped thousands of times a day by AI systems, stressed Bannister. The spike is tied to RAG and grounding techniques, which trigger fresh pulls of the same content for each new type of query. So, sure, there may be ways AI companies get more efficient at that in time, and a single pull will suffice, but for now, there is value in that for publishers, if they have a deal based on pay per crawl, for example.
“I do hear a lot from publishers these days that the type of training deals publishers were doing a year ago are not going to renew,” added Bannister. “Everyone is talking more and more about grounding being the right thing, and probably because, to some level, there is an easier business model behind it.”