WTF is the IAB’s AI Accountability for Publishers Act (and what happens next)?

This article is a WTF explainer, in which we break down media and marketing’s most confusing terms.

This week, the Interactive Advertising Bureau proposed new protections to stop AI bots from freely harvesting online content. 

The trade body’s president and CEO, David Cohen, revealed its legal framework, the AI Accountability for Publishers Act, at the IAB’s Annual Leadership Meeting (ALM) in Palm Springs on February 2.

The legislation seeks to hold AI companies to account for illicit scraping and for failing to comply with the “no crawling” specifications in publishers’ robots.txt files. Those specifications are notoriously difficult to enforce, and noncompliance can have devastating consequences for publishers.
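
For readers unfamiliar with the mechanics, a robots.txt check can be sketched in a few lines of Python. This is a minimal sketch using the standard library’s urllib.robotparser module; the robots.txt contents, the bot names ExampleAIBot and ExampleSearchBot, and the publisher.example URL are all hypothetical. Nothing forces a crawler to run a check like this before fetching a page, which is why the file is so hard to enforce.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary publisher.
# The bot names are illustrative, not real crawlers.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


STORY_URL = "https://publisher.example/story"

# The AI bot is told to stay out; every other crawler is allowed in.
ai_allowed = is_allowed(ROBOTS_TXT, "ExampleAIBot", STORY_URL)
search_allowed = is_allowed(ROBOTS_TXT, "ExampleSearchBot", STORY_URL)

print("ExampleAIBot allowed:", ai_allowed)
print("ExampleSearchBot allowed:", search_allowed)
```

The check runs entirely on the crawler’s side. A scraper that skips it, or that announces itself under a different name, gets the page anyway.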

Tollbit’s latest report, out this week, highlights the sprawling ecosystem of third-party web scrapers that has sprung up – built to feed AI and enterprise developers, often operating around paywalls, bypassing web controls and fueling legal and technical conflicts. Tollbit has documented nearly 40 APIs servicing this industry. 

They all flout robots.txt. 

Naturally, the IAB is a trade body, not a regulator, so it can’t force AI companies to comply, but its Act could raise the legal and reputational risks of ignoring publisher restrictions. 

“Ultimately, we believe this to be an absolutely critical issue to the future of the free and open internet, which is funded by an ad-supported model, largely. No industry can survive by giving away its goods for free,” said Michael Hahn, evp and general counsel at the IAB and IAB Tech Lab.

Hahn cited recent industrywide increases in AI bot traffic to publishers’ sites, and decreases in human traffic as more people use AI search tools to find information, leading to fewer clicks to websites. “When you have that dynamic, it will lead to a terrible place for publishers. We don’t want it to become like the desert that exists for local media,” Hahn said.

What does the Act mean by “unjust enrichment”?

AI companies have argued in copyright lawsuits brought against them by publishers that they are not committing any wrongdoing by scraping publishers’ content because the “fair use” doctrine allows the use of copyrighted material to make something new that doesn’t compete with the original work (this has been the defense given by OpenAI and Microsoft in The New York Times’ lawsuit against those companies, for example).

But this Act would supersede the usual U.S. copyright law and any state-level intellectual property laws, because it’s based on U.S. common law. One of the basic claims under common law is “unjust enrichment,” which is when one party takes something from another party without compensation and profits from it, Hahn said. What’s unique about the “unjust enrichment” claim is that there is no “fair use” defense, he added.

So what exactly is the Act aiming to clamp down on? 

Here’s a boiled-down version of what the Act proposes:

Site owners can take bot operators to federal court if they find those operators are scraping information from their sites without paying for it. If they do decide to sue, they can recover the value of the content or any profits lost because traffic was diverted.

They can also:

  • Take any profits the bot operator made from using the content 
  • Get a court order to stop the unauthorized use
  • Recover legal costs

Violators could face triple the damages, meaning they’d pay three times the losses they caused, if they: 

  • Fail to accurately disclose the identity, nature, purpose and operational scope of their bots to the digital property
  • Conflate the function of a search bot with that of a scraping bot, or vice versa
  • Fail to comply with robots.txt
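
The remedy arithmetic in this summary can be illustrated roughly as follows. This is a simplified reading of the summary, not the text of the draft bill: the function name, the trigger labels and the dollar figure are hypothetical, and real damages would be determined by a court.

```python
TREBLE_MULTIPLIER = 3  # "triple the damages", per the summary

# Hypothetical labels for the three triggers listed in the summary.
TREBLE_TRIGGERS = {
    "inaccurate_bot_disclosure",
    "conflated_search_and_scraping",
    "ignored_robots_txt",
}


def damages_owed(proven_losses: float, violations: set[str]) -> float:
    """Return the amount owed: losses, tripled if any trigger applies."""
    if proven_losses < 0:
        raise ValueError("proven_losses must be non-negative")
    unknown = violations - TREBLE_TRIGGERS
    if unknown:
        raise ValueError(f"unknown violation labels: {sorted(unknown)}")
    multiplier = TREBLE_MULTIPLIER if violations else 1
    return proven_losses * multiplier


# Illustrative figure only: $100,000 in proven losses.
base_case = damages_owed(100_000, set())
trebled_case = damages_owed(100_000, {"ignored_robots_txt"})

print("Without a trigger:", base_case)
print("With a trigger:", trebled_case)
```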

‘Conflates the function of a search bot with a scraping bot’ – that seems aimed at Google? 

Seeing as it’s the only company to do this currently, that’s a safe assumption. The U.K. regulator, the Competition and Markets Authority, made it very clear with its proposal last week that Google doesn’t provide publishers with any useful controls to differentiate between indexing for discovery and indexing for use (for training models or generating summaries, for instance).

Running parallel is the European Commission’s AI antitrust investigation into whether Google is unfairly using web publisher content and YouTube videos to train models without creators’ consent. Then the IAB comes out with proposed legislation that includes penalties for any company that conflates a search crawler with an AI bot scraper. “There’s obviously one company that does that at a massive scale,” said one publishing exec who agreed to speak on the topic in exchange for anonymity. The fact that two regulators and the U.S.’s largest industry trade body are now all saying the same thing makes Google’s ongoing claims that it’s fair use “start to ring hollow,” said the same executive.

So, what happens next?

The IAB has sent the draft legislation to Senate staff and key members in Congress, according to Hahn, and is currently in the process of setting up time with lawmakers to explain the issue, and how this bill could solve it. The next step is to find a member of Congress to sponsor the bill, and “educate and gain traction and support around [the draft bill],” Hahn said.

“Our view of it is, let’s not try to have a legislative solution once a full-on crisis exists, let’s be thoughtful about this ahead of time,” he added. “We can’t wait five years for all these different courts to probably come out different ways, and for them to go up to appellate courts and eventually to have the Supreme Court rule on an issue, because by then, the damage will be done. The model will be broken.” 

What is this going to look like in practice? How will they get AI companies to comply?

If this bill does become law, the legal burden would still lie with publishers. They would need to prove that AI companies are scraping their sites without compensation (by tracking AI bot traffic) and then sue those companies. But they would have another law as the foundation of their claims, as opposed to copyright law, which has that fair use defense, Hahn explained.
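
Tracking AI bot traffic usually starts with the user-agent strings in a site’s server logs. The following is a minimal sketch, assuming logs in the common “combined” format where the user agent is the last quoted field; the bot tokens, IP addresses and log lines are hypothetical. Because it relies on bots identifying themselves, it would miss any crawler that misstates its identity.

```python
import re
from collections import Counter

# Hypothetical crawler tokens a publisher might look for in user agents.
AI_BOT_TOKENS = ("ExampleAIBot", "SampleScraper")

# In the combined log format, the user agent is the final quoted field.
USER_AGENT_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_ai_bot_hits(log_lines):
    """Count requests per known bot token, based on the user-agent field."""
    counts = Counter()
    for line in log_lines:
        match = USER_AGENT_PATTERN.search(line)
        if not match:
            continue  # skip lines that do not end in a quoted field
        user_agent = match.group(1).lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break
    return counts


# Illustrative log lines only.
SAMPLE_LOG = [
    '203.0.113.5 - - [02/Feb/2026:10:00:00 +0000] "GET /story HTTP/1.1" 200 512 "-" "ExampleAIBot/1.0"',
    '203.0.113.6 - - [02/Feb/2026:10:00:01 +0000] "GET /story HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [02/Feb/2026:10:00:02 +0000] "GET /other HTTP/1.1" 200 512 "-" "ExampleAIBot/1.0"',
]

hits = count_ai_bot_hits(SAMPLE_LOG)
print(dict(hits))
```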

There is a provision in the draft bill that calls for “treble damages,” which Hahn described as the primary “disincentive” to AI companies continuing with their scraping behavior. If AI companies are found liable for scraping publishers’ content without paying them, and damages are proved in court, those companies would have to pay three times the amount of those damages.

What’s the likelihood of this getting approved swiftly at the federal level?

The ideal scenario is for a bill like this to go through various committees, then a general vote in the House of Representatives, and then the Senate, and if it’s approved, it would await the President’s signature.

“The problem is there are thousands of pieces of legislation put to Congress every year and very few of them make it all the way to the President’s desk,” said Aaron G. Rubin, partner in the strategic transactions and licensing group for law firm Gunderson Dettmer. “That’s only magnified by the fact that the U.S. Congress and its relationship with the President isn’t totally functional at the moment. You combine that with the clear intent for there to be maybe less regulation of AI companies than more regulation at this current moment, and I think a bill like this would face significant headwinds in Congress.”

Paul Ragusa, IP partner at law firm Baker Botts, said that while this creates legal grounds to challenge web scraping for AI training data, the Act will likely face pushback from both the current administration and parts of the tech industry. “The administration’s America’s AI Action Plan aims to reduce regulation and lower barriers to domestic AI innovation,” he said. “They may see the Act as working against this plan by restricting access to certain training data. The tech industry has similar concerns, particularly because the Act would override the fair use doctrine, a defense the industry has leaned on heavily in copyright infringement cases.”

Could there be other options? 

Yes, the IAB could potentially work at the state level. Currently, the legislation is worded in a way intended to push for it to be a federal law, noted Rubin. But there could be a path to go state by state if it came to it. That’s what happened with the California Consumer Privacy Act, which was passed in 2018 and took effect in 2020. It had a huge impact and has led over 20 other states to follow suit with their own data privacy laws.

But it’s not ideal. “The issue with doing it at a state level means it creates a patchwork of laws that make it difficult to comply with,” Rubin said. That said, the current proposal is relatively simple: get consent (from publishers and content creators) and make honoring robots.txt files a legally binding commitment. That simplicity could make it easier for laws to stay similar enough across different states, added Rubin.

Can the IAB enforce the fines if the bill doesn’t become law? 

No, the IAB doesn’t have government power to make or enforce laws. But its position as a self-regulatory group, where companies agree to follow shared rules, does mean its frameworks can have real weight. 

Rubin pointed to the Digital Advertising Alliance as an example – a framework that many ad and ad tech companies have signed up to. It’s voluntary, but there are consequences: groups like the Association of National Advertisers can step in, and companies risk public scrutiny if they don’t comply. Plus, the FTC can step in at the government level if a company’s behavior is unfair or deceptive, he added.

The snag here is that it’s unclear, with a robots.txt-based bill, how compliance would be enforced since it applies to a broad set of AI developers, rather than a defined group that has agreed on shared rules. 
