How the Guardian shines a light on dark traffic

To monetize traffic, publishers need to understand their audience’s route, profile and then value. Dark traffic, where the source is unknown, throws this process into confusion. It is a growing concern as more readers arrive at publishers on mobile and through social side doors.

The Guardian has a way to crack some of dark traffic’s mysterious origins: Ophan, the publisher’s in-house analytics tool, created in 2012 and now used by 1,000 employees each month, is a sophisticated instrument that measures reader behavior in real time — and allows the publisher to understand where some of them are coming from.

Ophan can pinpoint some dark traffic from the referral pattern on individual articles. If, for example, it sees a lot of Reddit and a lot of unknown sources, it can infer that a chunk of the latter is from various Reddit apps on mobile that don’t have referral links. In the U.S., when a surge of unknown traffic was noticed in a 3-month-old story, the Guardian was able to match the spike to referrals from The Skimm newsletter.
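As a rough illustration of that kind of inference, here is a minimal Python sketch. It is not Ophan's code, which the Guardian has not described in this detail; the function name, the `unknown` key and the 50 percent threshold are all assumptions made for the example. It simply asks whether one known referrer dominates an article's identifiable traffic and, if so, offers it as a guess for where the unknown visits came from.

```python
def guess_dark_source(referrals, unknown_key="unknown", min_share=0.5):
    """Guess which known referrer may explain an article's unknown traffic.

    `referrals` maps a referrer name to a page-view count for one article.
    Returns the dominant known referrer, or None when there is no unknown
    traffic to explain or no single referrer stands out.
    The names and the threshold are illustrative assumptions.
    """
    # Nothing to explain if the article has no unknown traffic.
    if referrals.get(unknown_key, 0) == 0:
        return None

    # Keep only identifiable referrers that actually sent traffic.
    known = {src: n for src, n in referrals.items()
             if src != unknown_key and n > 0}
    if not known:
        return None

    total_known = sum(known.values())
    top_source, top_count = max(known.items(), key=lambda kv: kv[1])

    # Only make a guess when one referrer clearly dominates.
    if top_count / total_known >= min_share:
        return top_source
    return None


article = {"reddit": 800, "google": 100, "unknown": 600}
print(guess_dark_source(article))
```

A guess like this is only a starting hypothesis. In the Skimm example, the match came from lining up the timing of the spike with a known newsletter send, which a single snapshot of counts cannot show.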

It turns out, a lot of the Guardian’s dark traffic is coming from readers’ phones’ home screens.

“Right now the biggest unknown comes from Spotlight search,” said Chris Moran, audience editor at the Guardian, and one half of the duo who created Ophan. Readers access Spotlight search when they swipe right from the iPhone’s home screen to get a few links to recommended content. These articles have little to no advertising and very pared-down branding.

Initially, Android was the suspected culprit. When users tap on stories inside apps on their Android devices, the stories open in a Chrome browser, stripping out the referral links. But when Spotlight search was introduced, Ophan showed that around 89 percent of the unknown traffic was coming from iOS devices, allowing Moran and his team to identify this as the source.
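The platform breakdown behind a figure like that is simple arithmetic. The sketch below assumes a hypothetical list of page-view records with `referrer` and `os` fields (not Ophan's actual data model) and computes what share of the referrer-less views came from a given operating system.

```python
def platform_share_of_unknown(hits, platform):
    """Return the fraction of referrer-less page views from `platform`.

    `hits` is a list of dicts with hypothetical keys "referrer"
    (None when the source is unknown) and "os".
    """
    unknown = [h for h in hits if h["referrer"] is None]
    if not unknown:
        return 0.0
    from_platform = sum(1 for h in unknown if h["os"] == platform)
    return from_platform / len(unknown)


# Made-up sample: eight unknown views on iOS, one on Android,
# plus one iOS view with a known referrer that is ignored.
sample = (
    [{"referrer": None, "os": "iOS"}] * 8
    + [{"referrer": None, "os": "Android"}]
    + [{"referrer": "reddit.com", "os": "iOS"}]
)
print(f"{platform_share_of_unknown(sample, 'iOS'):.0%}")
```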

But Ophan can’t always suss out dark traffic’s murky origins. This recent Guardian article, “The $400,000-a-year New Zealand job with three months’ holiday that no one wants,” saw 20 percent of its traffic come from unknown sources, a figure Ophan hasn’t yet cracked. A typical Guardian story will see much less of its traffic — 10-15 percent — come from unknown sources. The publisher suspects a lot of it is from readers emailing and texting links to each other.

“It will only get worse. It’s just human behavior, and it’s not going to go away,” said Mark Syal, head of media practice at agency Essence. “Copying a link and sending it to someone is as common, if not more so, as using a share button.”

For most publishers, there’s little that can be done, apart from cross-referencing with other data to make assumptions on what the audience looks like, then assigning a certain value or behavior to them.

“It’s not just a referral agency problem; it’s a user problem deep down,” points out strategy director David Carr at DigitasLBi. “The journey is broken for users who need to jump around in apps and on browser, because ad tech is getting in the way. That strips out the attribution. What could be a connected journey becomes disjointed. If we solve it for the user, we might solve the attribution.” But that’s easier said than done.
