Why The Guardian’s first reader-facing AI product isn’t a chatbot
This article is part of Digiday’s coverage of its Digiday Publishing Summit.
The Guardian didn’t want to build an AI chatbot. Not a reader-facing one anyway. Not at the risk of that chatbot misrepresenting the news publisher’s journalism and undermining readers’ trust.
“We’re not going to die if we don’t build a chatbot tomorrow. We need to be really clear about what the threats are externally, but ultimately what we have is something that’s worth protecting,” said Chris Moran, head of editorial innovation at The Guardian, during a live recording of the Digiday Podcast at the Digiday Publishing Summit in Vail, Colorado, on March 23.
While not a chatbot, The Guardian has begun to roll out its first reader-facing AI product. But it doesn’t really look like an AI product.
Called Storylines, the product is an AI-generated spin on the related links module common to publishers’ pages. It currently appears on a subset of The Guardian’s so-called “tag” pages, which typically list articles related to a given topic, such as “Trump,” in reverse-chronological order. Amid this article feed is an unassuming box with a selection of related articles threaded to a given narrative or storyline.
“It’s generating from a list of the most recent 200 articles on this tag what it thinks the three big storylines are across those articles. And it’s really good technology for this. It’s good at applying narrative to something and decoding it,” said Moran.
The Storylines box is clearly labeled as AI-generated, though the only AI-generated text is the three subtitles across the top of the box that define the three related storylines.
“It’s a curatorial tool, fundamentally. What it’s designed to do is highlight the journalism and showcase the journalism,” Moran said.
To ensure the AI tool accurately highlights the journalism, The Guardian does not provide the large language model with the actual article text. Only the headlines. That way the publisher controls the context of the information that the LLM can work with to limit the potential for hallucinations.
“What you’re saying to the machine is ‘I want you to pay attention to what we said.’ This is about what our human editors said,” said Moran. He added, “It stops the machine from noticing a reference to Brad Pitt in the 17th paragraph of something and assuming [the entire article] is about that.”
To further rein in the LLM, The Guardian had a team of 20 senior editors evaluate multiple instances of its output. “We were asking them to look at a set of at least 10 of these across the course of two weeks and giving us quite detailed feedback,” Moran said. The Guardian’s data science team then applied that feedback to model updates.
Even after all that, The Guardian’s Storylines tool remains in limited testing on only 10 tag pages. The publisher expects to roll it out more broadly, but never across its entire corpus.
“We’re never going to turn this on on 27,000 tag pages. And there are certain kinds of tag pages we’re just going to avoid,” Moran said. And of course, given the atomic threat of generative AI to the media industry, there is the nuclear option. Said Moran, “We obviously have a very large red button which says, ‘Turn this off.’”