Cloudflare’s Human Native acquisition signals a new content economy for publishers
Cloudflare’s move to bring AI startup Human Native into its stack signals a turning point: licensed, structured content could become a foundation for a more sustainable AI economy.
While the ink is pretty fresh on the acquisition, announced on Jan. 15, several media experts and publishers regard it as a signal for how Cloudflare plans to help build an infrastructure for the AI content economy.
And Human Native’s platform addresses a vital part of the AI compensation struggle publishers have had to date: incentives for AI developers to opt in.
What Cloudflare is really trying to build with Human Native
Cloudflare is effectively building an AI licensing stack for its publisher clients. U.K.-based Human Native helps turn publisher content into AI-ready data and makes sure the people who created it get paid.
That’s a route Cloudflare has already explored, having kicked off a private beta for a new kind of web index, called AI Index, designed to help creators make their content accessible to AI by giving AI developers higher-quality data and creators fair compensation. The tool, announced last September, has shown promise, Cloudflare’s vp of publisher products Will Allen said, though he wouldn’t reveal specifics.
Folding in Human Native’s team and platform fortifies that capability under a common mission, stressed Allen. “We need lots of collaborators, and it means really pushing things forward with better control for publishers, better control for content creators, and better content — better data — for AI companies,” he told Digiday.
This is the latest in a string of moves Cloudflare has made to redress the imbalance between publishers and the AI companies that have ripped off their content for free to train their LLMs.
Last year, a flurry of new products from Cloudflare, including bot-blocking by default, Content Signals Policy, pay-per-crawl tools and AI Index, signaled the direction it is taking: building an AI-friendly infrastructure that helps publishers monetize content, control access, and ensure fair compensation when their work is used by AI developers.
But blocking AI crawlers alone isn’t enough for publishers — to turn content into a sustainable revenue stream, AI developers need real incentives to opt in and pay for access.
How would this incentivize AI companies to opt in?
AI developers can’t rely on scraping forever: without licensed content, models risk low-quality training, regulatory backlash, and strained relationships with the very publishers and creators whose work powers their products. To date, it’s been hard to incentivize most AI developers to pay. The exceptions are the biggest ones (OpenAI and, more recently, Meta), and they have deeper incentives than ethics: legal risk mitigation.
“The web is messy, and there’s a lot of unstructured, unlabeled content out there that is being thrown into the training of these models and effectively just churned around until something useful comes out,” said James Smith, co-founder of Human Native. “You can save a lot of time and effort and achieve superior results if you put in better, more structured data.”
That led the Human Native team to start thinking about what AI developers’ challenges were and what could bring them to the table, instead of leaving them to ransack the free buffet of content on the web. Ethics and legality aside, scraping is ultimately not good for their own products, he stressed.
Smith pointed to a client, whom he wouldn’t name, but described as a U.K.-based AI startup. This AI company, like most, had gobbled up all the videos available on the internet to train its models. Human Native started to provide them with high-quality data from U.K. video production companies that worked on Hollywood movies featuring great talent. The result was that the AI model was able to ingest a quality and depth of data and metadata, organized and structured to a level it hadn’t experienced before, per Smith.
And how well does the publisher get paid?
This particular video production company usually operates on a project-by-project basis, often running on tight budgets. Typically, these companies hire large crews for a single major project, like a Hollywood production, then the staff move on to other short-term gigs to fill their schedules. But working with the AI developer for AI royalties meant that the studio was able to keep its facility active during gaps between major projects and provide consistent employment for staff. Plus, the work was structured via contracts that mirrored how movies handle royalties: all artists involved earned royalty-style payments whenever the resulting data sets were used for AI training, according to Smith.
“I think that gives you a glimpse into what the future can be here, where everybody benefits, where the AI companies get something better, and the creators get something in return for their hard work,” he said.
The production company got an upfront payment, then a bonus payment linked to revenue targets for that AI company, though Smith wouldn’t reveal specific numbers.
Smith said the team has since learned that it can be more aggressive with payment terms for publishers. In its earliest deals, the AI companies smashed through those revenue targets pretty fast: it took under 12 months for them to hit the bonus targets and provide the second tranche of payments to the creators. “If I were doing those deals today, I would set up a more aggressive revenue target bonus payment structure because I do think AI companies are growing incredibly quickly,” he added.
Can it be enough to incentivize the biggest LLM players?
Time will tell. “I think they [Human Native] have carved up that niche for the smaller LLMs, and people that want good data and want to be ethical about it or don’t have the teams or the money to black hat their way into that content,” said Scott Messer, principal and founder of publisher consultancy Messer Media.
“We still need the business mechanisms for doing things legally, you can’t just keep shouting ‘that’s illegal, I don’t want you doing that’ – which is what we’re currently doing with LLMs – we’re suing them and blocking them.” A marketplace like Cloudflare’s with Human Native could help solve for that, he added.
The end of the AI scraping Wild West — or just a new gatekeeper?
Let’s not get carried away. There are plenty of reasons to like the acquisition, but at its core, it’s just good business, stressed David Buttle, founder of media consultancy DJS Strategies and former platform strategy chief at the Financial Times. Publishers may fear a Cloudflare monopoly, but controlling 21 percent of sites falls short of Google-level dominance, though it’s still a significant slice.
Buttle sees the acquisition as a tactical move to improve Cloudflare’s CDN solution and expand its customer base, rather than a significant market play. “Their solution is vendor locked, so you can’t access their marketplace if you’re not on Cloudflare.”
But that lack of monopoly means there is little risk for publishers. “If it establishes norms that intellectual property is paid for when it’s being developed and deployed by AI applications, then it’s only a positive thing.” He added that the AI content industry is still in its early chaotic days, reminiscent of the ad tech boom publishers faced in the early 2000s. “We still need to make the market. The market isn’t really there at the moment.”
‘Leaky’ content distribution could cause LLM workarounds
Some publishers worry that even with better AI licensing and protections, content will still leak across the web, ending up repurposed or appearing on long-tail sites. It’s a problem they’ve wrestled with for years, whether from made-for-advertising (MFA) schemes or low-quality copies. That long history of content leakage also makes it easier for LLMs to absorb and reuse publisher content without ever compensating the original source.
“Content discovery and distribution is very leaky,” said Tom Bowman, media consultant and former svp of revenue operations for BBC Studios. “Some original publishers are sometimes complicit in allowing that to happen, and in other instances, they’re very unhappy about it. The danger is that it’s kind of an all-or-nothing — publishers have to do this, because if a few of them do it, then people [LLMs] might go around them.”