WTF is OpenAI’s GPTBot?

This article is a WTF explainer, in which we break down media and marketing’s most confusing terms.

Publishers have a new tool in their efforts to limit AI’s threat to their businesses. And it’s from the company behind one of the predominant threats.

In August, OpenAI announced that website owners can now block its GPTBot web crawler from accessing their webpages’ content. Since then, 12% of the 1,000 most-visited sites online have done so, according to Originality AI. The list of sites shutting themselves off to OpenAI’s web crawlers includes publishers such as Bloomberg, CNN and The New York Times.
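For readers who want to see the mechanics, the block is typically expressed in a site's robots.txt file, using the `GPTBot` user-agent token that OpenAI documented. Below is a minimal sketch, not a definitive implementation: it uses Python's standard-library robots.txt parser to check what such a rule permits. The `example.com` URL, the `SomeOtherBot` name and the `is_allowed` helper are hypothetical, chosen only for illustration, and the check assumes the crawler chooses to honor robots.txt, which is a voluntary convention.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that turns away OpenAI's crawler from the entire site.
# "GPTBot" is the user-agent token OpenAI documented; everything else
# in this sketch (URL, helper name, other bot) is hypothetical.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    print(is_allowed(ROBOTS_TXT, "GPTBot", page))        # False
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))  # True
```

Because the rule names only `GPTBot`, other crawlers are unaffected unless the file lists them too.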

As Digiday has covered, publishers have had a hard time stopping generative AI tools like ChatGPT from sidestepping their paywalls and siphoning their content to inform the tools’ large language models. OpenAI’s announcement, however, makes that undertaking much easier.

For those unfamiliar with what a web crawler like OpenAI’s GPTBot is, not to mention how websites are able to block its access, check out the explainer video skit below.

