WTF is ‘shadow AI,’ and why should publishers care?

If I paid for ChatGPT Plus from my own pocket and used it to help me write this article, would my boss know — or care?

That’s the question surrounding “shadow AI” — which refers to the use of AI tools at work that haven’t been formally approved by companies.

The ongoing pressure to work faster, along with the proliferation of easy-to-use generative AI tools, means more editorial staff are using AI to complete tasks. And while using generative AI for minor tasks — like grammar checks, rewriting copy or testing different headlines — falls into one bucket of offenses, other uses could cause bigger problems down the line if left unchecked.

That could lead to inconsistent editorial standards, security vulnerabilities and ethical missteps, legal experts say.

Here’s a more detailed look at what it is, and how publishers are set up to deal with the risks. 

What is shadow AI?

Shadow AI refers to the use of AI tools at work that haven’t been officially approved or licensed by the company. It’s been a thorn in the side of IT departments everywhere, as the unsanctioned use of generative AI tools at work can make businesses more vulnerable to data breaches. 

But its application to newsrooms poses a unique set of considerations. Inputting data like confidential source material, proprietary research, embargoed news and copyrighted information into large language models without a publisher’s oversight doesn’t just risk exposing that information or undermining the journalist’s reputation and the accuracy of their work — it could even be illegal.

“If somebody takes [my work] and puts it into a system, and now the owner of that system has a copy of it, they could potentially use it in ways that [I] never intended or would have ever permitted. And one of those ways is training AI,” said Michael Yang, senior director of AI advisory services at law firm Husch Blackwell. “You could be in a situation where you have inadvertently or unintentionally caused a breach of a contract situation.”

If newsroom employees input copyrighted material into an LLM that then uses it for training, the publisher could face legal issues down the road.

What are the risks involved?

Legal experts who spoke to Digiday cited three main considerations: the potential bias of AI models, confidentiality and accuracy.

The bias of AI models has been well-reported. If the data used to train AI models is biased or one-sided (such as skewed in favor of certain races or genders) and journalists depend on tools built from these models for their work, the output could end up perpetuating those stereotypes, Yang stressed.

LLMs scraping publishers’ online content and using it to train their models is at the heart of copyright infringement cases like the one brought by The New York Times against OpenAI. The same questions around how these LLMs are using data to train their systems are why it could be risky for journalists to input copyrighted data (or any sensitive information — such as confidential source information) into an AI model that is connected to the internet and not hosted locally, according to Felix Simon, a research fellow in AI and news at Oxford University who studies the implications of AI for journalism.

Sensitive data could be fed into these unapproved systems and used for training the AI models — potentially appearing in outputs, Simon said. And if these systems aren’t secure, they could be viewed by the AI tech company, the people reviewing model outputs to make updates to the systems, or third parties, he added.

Sharing copyrighted data in this way with an AI system could be illegal due to the way AI companies can ingest inputs and use them as training data, Yang stressed. And the publisher would be liable for either infringing on copyright or generating infringing content, added Gary Kibel, partner at law firm Davis+Gilbert, which advises media and advertising clients.

Meanwhile, using a tool that hasn’t been vetted can cause accuracy problems. “If you input into an AI platform, ‘If CEO Jane Doe did the following, what would that mean?’ and then the AI platform rolls that into their training data, and it comes out in someone else’s output that the CEO Jane Doe did the following… they may come to you and say, ‘How in the world did this get out? I told only you,’” Kibel said.

What are media companies doing to protect themselves?

Many of the larger publishers have established formal policies, principles and guardrails for their newsrooms.

Gannett has its “Ethical Guidelines and Policy for Gannett Journalists Regarding AI-Generated or Assisted Content.” It’s one of a number of publishers that have developed such policies — others include the Guardian, The New York Times and The Washington Post.

Publishers also have created internal groups dedicated to determining these principles and guidelines.

For example, Gannett has an “AI Council,” made up of cross-functional managers tasked with evaluating and approving new AI tools and use cases. Similar task forces cropped up at companies like BuzzFeed and Forbes in 2023.

“These protocols ensure the protection of Gannett personnel, assets, IP, and information,” a spokesperson for the company said.

Educating newsroom employees on the risks that come with using AI tools their company hasn’t paid for or approved is also key. A publishing exec — who spoke on the condition of anonymity — said the best approach is to explain the risks involved and make them personally relevant to employees.

So far, publishers think their policies, guidelines and AI-dedicated task forces are enough to steer their newsrooms in the right direction. 

That could work, as long as those guidelines “have some teeth,” with consequences such as disciplinary action clearly spelled out, according to Yang, who is a former director and associate general counsel at Adobe.

Companies can also whitelist technology that is approved for use by the newsroom. For example, The New York Times recently approved a whole host of AI programs for editorial and product staff, including Google’s Vertex AI and NotebookLM, Semafor reported.

But that’s hard to do for a small publisher with fewer resources, and it’s impossible to review every AI tool out there that a journalist might use. The legal experts and publishing execs who spoke to Digiday acknowledged the challenge of controlling how journalists use online information.

How do you police shadow AI?

You can’t. At least, not completely. But you can ensure that staff know where they and the company stand on how AI tools should be used at work. 

“It could be as simple as someone who’s running an app on their private phone,” Yang said. “How do you effectively police that when that is their phone, their property and they can do it without anybody knowing?”

But that’s where the formal policies, principles and guardrails set up by publishers can help. 

One publishing exec said they expected some shadow AI to happen in the newsroom, but were confident in the training their company provides. The company holds several training sessions a year on its AI policy and guidelines, which include not uploading confidential sources and material, financial data or personal information into LLMs the company hasn’t approved.

“I tend to trust people with making judgments in terms of the work that they do and knowing what’s good for them,” said the exec.

A Gannett spokesperson said the company has a “robust process” for approving and implementing technology across its newsroom. The company has a specific tech policy that outlines which software and online services are approved, as well as how to request access to, and payment for, other services if needed.

“This policy helps us ensure the security and integrity of our systems and data,” the spokesperson said.

According to a recent report by AI software company Trint, 64% of organizations plan to improve employee education and 57% will introduce new policies on AI usage this year.

But another question companies should ask themselves is: why are journalists doing this?

“Maybe they’re doing it because the tools that are being made available to them are not sufficient,” Yang said. “You can lean into it and say, ‘We’re going to vet the tools, have technical protections for the tools…  and we’re going to have policies and education to make sure you understand what you can and can’t do [and] how best to use it to prevent these problems.’”
