Why podcast companies are investing in AI-generated podcast translations despite questionable quality

In January, iHeartMedia announced plans to use generative AI tools to translate five to 10 existing shows into a number of different languages by the end of this quarter. But the company has pushed back that timeline. 

iHeartMedia is now shooting to launch those translated shows by the end of the first half of 2024, a company spokesperson told Digiday.

“We have been experimenting. It’s getting better. Not quite to the level we need it to be to say, ‘Let’s roll it out.’ But it’s making fast-paced gains in terms of the quality. And we anticipate probably this year that we’ll get stuff of that quality… But we’ll know a lot more by probably the next couple of earnings calls,” iHeartMedia CEO Bob Pittman said on the company’s fourth-quarter earnings call on Feb. 29.

While podcast networks like iHeartMedia, Spotify and PodcastOne have publicly announced plans to debut AI-generated audio translations, few have gone live yet (Spotify has released a handful of test episodes).

Execs at Spotify, which announced in September it was launching a pilot program with a few podcasters to test AI-generated voice translations, didn’t mention that program at all in the company’s Q4 earnings call on Feb. 6. A Spotify spokesperson told Digiday there was nothing new to share on that front yet.

Questions about quality have left agency execs unsure of AI’s translation ability. One podcast ad agency exec, who spoke on condition of anonymity, told Digiday they would only be interested in buying ads around that content if the quality was “really good” and if it had an audience, neither of which they’d seen evidence of yet.

But they said they could see the logic behind podcast networks testing this. If a podcast network had a million downloads a month, for example, generative AI tools could translate shows at a relatively low cost and add 100,000 impressions a month, which could be monetized with programmatic ads, they said.

However, the colloquialisms, analogies and cultural nuances in podcasts can be difficult to translate well with generative AI tools, they said. The technology generally works by cloning a podcast host’s voice and augmenting it to read the show’s transcript in a different language.

But some podcast companies are going ahead with debuting translated shows to test how well they attract an audience. PodcastOne is working on rolling out “a handful” of podcasts translated into Spanish, according to Rob Ellin, CEO of parent company LiveOne. He did not share when the translated episodes will debut.

The company is working with AI translation company Rask AI to augment the voices of some of its podcast hosts into Spanish. PodcastOne is starting with translating the true crime podcast “Bad Bad Thing.” Ellin said the accuracy of those translations is overseen by PodcastOne’s talent and production team, but he declined to share further details.

“We haven’t proven it’s a money-maker yet, but we’ve definitely proven that the sound is good, the quality is good. We want to keep getting better,” Ellin said.

At the audio industry event Hot Pod Summit last week in New York City, The Verge’s editor-in-chief Nilay Patel tested the quality of AI-generated audio live. He played a number of audio clips and presented the audience with a six-question quiz: Was that clip AI-generated or not?

The results were mixed. On the first question, which had three options, only 11% of attendees picked the correct AI-generated clip. On other questions, the share of attendees choosing the correct answer ranged from an even 50/50 split to 60-70%.

But for one question, Patel played an audio clip in Spanish — and 95% of attendees correctly said the voice was AI-generated. When Patel asked the audience how the translation sounded, one person shouted: “That was awful!”

So why are publishers testing this, if the quality isn’t up to snuff? Because it’s a cost-effective way to expand podcasts internationally and into non-English-language markets, execs said.

“It is uneconomic to do it manually — because there’s so many episodes of so many podcasts, there’s so many languages — and AI is really the solution,” Pittman said during iHeartMedia’s recent earnings call. But he added, “How quickly we’ll be able to monetize it and get it out there? I don’t think we have any projections yet.”

Some podcast hosts disagree with the cost-savings argument. Marshall Poe, podcast host and founder of New Books Network, translates a number of his company’s shows into Spanish the old-fashioned way — by hiring Spanish-speaking podcasters. Forty percent of New Books Network’s listeners are not in the U.S., according to Poe. But he doesn’t think using AI tools would save much time or money, because he’d need people to verify the accuracy of the translated podcasts anyway, he said.

Translating podcast hosts’ voices into different languages would also likely require podcast networks to renegotiate contracts with those hosts, noted Aricia Skidmore-Williams, co-host of three Wondery podcasts including “Even the Rich.”

“I would want to have some kind of assurances if… AI is going to be a part of our contract,” she said.

