
The Rundown: Google has drawn its AI payment lines — and publishers’ leverage is narrow


Google’s testimony to U.K. lawmakers this week did more than restate familiar arguments about fair use and training. It clarified the boundaries of what the company believes it should, and should not, pay publishers for in the AI-driven search ecosystem.

For publishers trying to navigate AI licensing, the message was blunt: Google is willing to pay for access, but not for training — and it remains unwilling to define AI Overviews as a compensable use of journalism.

That distinction matters because it maps out where publishers may still get paid, where AI licensing deals are still possible — and where the fight is effectively over.

Naturally, publishers don’t like it.

“Google’s position on willingness is mostly immaterial,” said Danielle Coffey, president and CEO of News/Media Alliance. “Google and most other Big Tech giants have disregarded copyright law as though it doesn’t apply to them for years now. But this is a question of legality, not preference. If Google is legally required to pay for training materials, as it should be, it must comply with the law,” she said.

Digiday spoke with several media execs and policy and legal experts for this article, some of whom agreed to speak on condition of anonymity.

Paying for training is off the table

This isn’t a blanket refusal to ever pay publishers for content used for AI purposes. But in her appearance before the U.K. Parliament, Roxanne Carter, Google’s head of public policy for copyright, made clear that the company does not believe it needs to license unpaywalled content for AI training.

Her reasoning, she said, is that training LLMs on open-web content is a process of statistical analysis rather than copying or retrieving information, so no payment is required. That’s a hard line for publishers: it signals there is no room for negotiation here.

However, Google appears to be drawing what publishers perceive as a “false distinction” between the use of data at the training stage and the use of copyrighted material in outputs, as one exec put it, who agreed to speak candidly on condition of anonymity.

The concern is that if AI companies are given a special exemption that lets them use copyrighted material for training without permission, it would encourage them to rely on other people’s work without credit, links, or payment.

Access, not learning, is where Google will pay

Carter made it clear, however, that Google is willing to open its wallet on controlled access, which could be archive content, off-platform datasets, APIs, or other work that has been opted out of AI training. 

When a model is grounded in real time via retrieval-augmented generation (RAG), Google is effectively trying to separate the resulting AI answers into two categories and argue that only one of them should ever involve payment to publishers. The difference comes down to whether the AI response clearly points back to a publisher.

It calls these non-expressive outputs (no payment) and expressive outputs (potential payment). Non-expressive outputs are AI answers that don’t contain a brand, link or attribution to a source, and therefore, in Google’s view, do not attract copyright. Expressive outputs do contain a brand, link or attribution, and may be subject to copyright.

In this theoretical framework, Google gains commercial benefit from the use of original journalism to build a model capable of generating non-expressive outputs, according to an exec who spoke with Digiday on condition of anonymity.

The fact that Google decides not to attribute those outputs to the sources of that training data is irrelevant to the value extracted, they stressed. And granting Google an exemption that puts the use of material for training outside copyright would encourage the use of third-party work to train and fine-tune models without attribution, links back to sources, or any form of payment to the creators of that content.

AI Overviews remain the unresolved fault line

When pressed specifically about AI Overviews, Carter did not provide a clear answer on whether publishers can genuinely opt out of AI Overviews while still benefiting from search.

That distinction is crucial because AI Overviews have dramatically reduced click-through traffic to publishers, and the existing opt-out controls (Google-Extended) that cover training bots don’t necessarily stop content from being used in AI Overviews.

“The tools that Google does offer to publishers to opt out are deliberately weak,” said Coffey.

The crux of the issue is that the only way to opt out of AI Overviews is either to opt out of Google search altogether, or to remain in search but apply nosnippet (basically a “do not summarize” sign for search engines) to each individual article. But implementing nosnippet has a “serious sting” in its tail, one publisher stressed.
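In practice, those two controls look like this (a minimal sketch of Google’s documented directives; note that, per the testimony, blocking the training crawler does not clearly keep content out of AI Overviews):

```
# robots.txt — opts the whole site out of Google's AI-training crawler
# (Gemini and related products), without removing it from search indexing
User-agent: Google-Extended
Disallow: /
```

The per-article alternative is a `<meta name="robots" content="nosnippet">` tag in each page’s head, or a `data-nosnippet` attribute wrapped around specific passages, which keeps the page in search results but strips the snippet, at the click-through cost publishers warn about.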

In 2019, Google’s president of Global Affairs, Kent Walker, warned that by displaying only the title, URL and thumbnails, nosnippet would lead to a 45 percent reduction in click-through traffic, making it an unacceptable option for publishers.

Carter pointed to ongoing discussions with the U.K.’s Competition and Markets Authority — which is investigating Google’s use of content for AI search — rather than directly addressing the question. The reality is that separate crawlers may be the only thing that would give publishers confidence.

Outside the hearing, that was widely interpreted as leaving publishers with no clear way to keep their content out of AI summaries while remaining findable in Google search results.

Paul Bannister, chief strategy officer at Raptive, stressed that Google’s messaging has always been purposefully opaque around this ability to opt out. “They’ve always purposefully kept AI Overviews data provenance/opt-out murky to take advantage of their monopoly position in search to effectively force publishers not to block them,” he said.

Bannister believes that if publishers were to broadly block other AI companies from accessing their content, it would expose how concentrated Google’s power really is, potentially creating enough pressure to push Google toward concessions. “But that’s a long game and not likely to happen anytime soon. So for the foreseeable future, Google will continue to steal publishers’ content and keep putting out murky messaging to confuse things,” he said. 
