How publishers like The Marshall Project and The Markup are testing generative AI in their newsrooms

By Sara Guaglione • August 29, 2023

Ivy Liu

Publishers including The Marshall Project and The Markup shared how their reporters are using generative artificial intelligence in the reporting process, after some failed tests.

The presentations were held at this year’s four-day Online News Association conference, which took place in Philadelphia from Aug. 23-26. The event had more than half a dozen sessions dedicated to the emerging technology.

Andrew Rodriguez Calderón, a computational journalist at The Marshall Project, and Mark Hansen, a professor at the Columbia School of Journalism, outlined ways they experimented with ChatGPT for journalism — and how they had to tweak their prompts to get what they wanted from OpenAI’s generative AI chatbot tool.

Calderón tried to use ChatGPT to generate summaries of banned book policies in different states, to save time on manually extracting that information from reporters’ notes. He asked ChatGPT to create summaries of those notes, which resulted in lackluster paragraphs. So his team iterated on the ChatGPT prompts to create descriptions with subheads of those notes, and then asked ChatGPT to group the relevant parts of the policies under those specific subheads. Two people fact-checked those descriptions.

Calderón said he believed this process spared him the cumbersome task of manually extracting information from those notes, so he could focus on fact-checking and formatting the information. He stressed the importance of documenting ChatGPT prompts in a log, sometimes referred to as a “prompt library,” of which iterations worked best, so they can serve as templates on future projects.
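A prompt log of the kind Calderón describes could be kept in something as simple as a list of records. The Python sketch below is illustrative only and is not The Marshall Project's tooling: the `PromptRecord` fields, the task name and the prompt wording are all hypothetical, loosely modeled on the two attempts described above.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PromptRecord:
    """One logged prompt iteration (all field names are hypothetical)."""
    task: str      # what the prompt was for
    version: int   # iteration number within that task
    prompt: str    # the exact text sent to the chatbot
    notes: str     # what the output looked like
    worked: bool   # whether the output was usable after review


def to_jsonl(records):
    """Serialize the log as one JSON object per line, easy to keep in a repo."""
    return "\n".join(json.dumps(asdict(r)) for r in records)


def latest_working(records, task):
    """Return the highest-version prompt marked as working, or None."""
    candidates = [r for r in records if r.task == task and r.worked]
    return max(candidates, key=lambda r: r.version, default=None)


# Hypothetical entries; the prompt wording is invented for illustration.
library = [
    PromptRecord("policy-summary", 1,
                 "Summarize these notes.",
                 "Paragraphs were lackluster.", False),
    PromptRecord("policy-summary", 2,
                 "Describe these notes under subheads, then group the "
                 "relevant parts of each policy under those subheads.",
                 "Usable after fact-checking.", True),
]

template = latest_working(library, "policy-summary")
```

Recording the failed first attempt alongside the working one keeps a trail of why the prompt changed, which is the point of the log.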

Columbia School of Journalism’s Hansen used ChatGPT to extract numbers from daily paragraphs from New York State’s system on monkeypox virus numbers, to find trends and spikes. It took some tweaking to get ChatGPT to understand he was looking for the biggest changes in the data and then to help create a template for a story on those findings.
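For comparison, the same kind of task can be sketched without a language model, which may be useful for spot-checking a chatbot's output. The Python below is not Hansen's method: it assumes, hypothetically, that each daily paragraph states its figure as a number followed by the word "cases," and the sample paragraphs are invented rather than taken from New York State's data.

```python
import re


def extract_count(paragraph):
    """Return the number that precedes the word 'cases', or None if absent."""
    match = re.search(r"(\d[\d,]*)\s+cases", paragraph)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def biggest_changes(daily, top=1):
    """Rank day-over-day changes by absolute size, largest first.

    `daily` is a list of (label, paragraph) pairs in date order.
    Paragraphs with no recognizable figure are skipped.
    """
    counts = [(label, extract_count(text)) for label, text in daily]
    counts = [(label, n) for label, n in counts if n is not None]
    changes = [
        (label, n - prev)
        for (_, prev), (label, n) in zip(counts, counts[1:])
    ]
    return sorted(changes, key=lambda c: abs(c[1]), reverse=True)[:top]


# Invented sample paragraphs, for illustration only.
sample = [
    ("day 1", "The state reported 10 cases."),
    ("day 2", "The state reported 12 cases."),
    ("day 3", "The state reported 30 cases."),
    ("day 4", "The state reported 31 cases."),
]

print(biggest_changes(sample))
```

A fixed pattern like this breaks as soon as the wording of the paragraphs varies, which is part of the appeal of handing the extraction to a chatbot and then verifying the numbers.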

“These are examples of how you have to get quite specific and granular to figure out which tasks [ChatGPT] is actually usable for and saves you time,” said Gideon Lichfield, former editor in chief of Wired, who moderated the panel. “There is effectively a whole programming language and a programming culture emerging around GPT. But unlike traditional coding languages, it’s imprecise. What makes a prompt more likely to lead to reliable results is a bit of an art.”

Sisi Wei, editor in chief at The Markup, said the newsroom’s policy is that journalists are not allowed to input unpublished drafts into ChatGPT, out of concern that the information would be fed into the large language model with no control over where it goes or how it’s used.

But Wei does input published headlines into ChatGPT to see if it can generate better alternatives. So far, it’s been “affirming,” she said, because all of the headlines it’s generated have been worse than the published ones.

