WTF is data leakage?

By John McDermott • January 27, 2015

When Google decided it would stop letting data-management platforms fire pixels on its ad network, it claimed that its motivation was to bring an end to “data leakage.”

Which raises the question: What is data leakage, anyway?

Below is a primer — explained as plainly as possible — on what data leakage is and why it’s a pressing issue.

So WTF is data leakage?
Data leakage typically occurs when a brand, agency or ad tech company collects data about a website’s audience and subsequently uses that data without the initial publisher’s permission.

What kind of data?
As people peruse the Web, they leave behind crumbs of data about themselves, such as the websites they routinely visit and the type of Web browser they use. Some websites collect more detailed information on users by having them complete user profiles. Pandora has users provide their age, gender and zip code, for instance. This information is used to target ads at specific demographic segments.

How does it leak from there?
The leakage occurs when a third party uses a tracking pixel to collect this data from a website. A pixel is an invisible piece of code that marketers attach to an ad to monitor the ad’s performance and ensure it’s reaching the intended audience. But pixels can also collect information about a publisher’s audience. Once it has collected that information, the third party can target those visitors on other websites.
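
For readers who want to see the mechanics, here is a minimal sketch of what the server behind a tracking pixel might do. It is an illustration under stated assumptions, not any vendor's actual implementation: the names `serve_pixel`, `audience_log` and `visitors_of`, the cookie IDs and the `.example` domains are all hypothetical.

```python
import base64
from collections import defaultdict

# A widely used 1x1 transparent GIF, base64-encoded. The browser renders
# nothing visible, but it still has to request the image from the third party.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

# Hypothetical third-party store: cookie ID -> set of sites the visitor was seen on.
audience_log = defaultdict(set)


def serve_pixel(cookie_id: str, referring_site: str) -> bytes:
    """Record which site the visitor was on, then return the invisible image."""
    audience_log[cookie_id].add(referring_site)
    return PIXEL_GIF


def visitors_of(site: str) -> set:
    """Cookie IDs the third party could later target on other websites."""
    return {cid for cid, sites in audience_log.items() if site in sites}


# Two page views on different (made-up) sites fire the same third-party pixel.
serve_pixel("cookie-123", "publisher.example")
serve_pixel("cookie-456", "othersite.example")
print(visitors_of("publisher.example"))  # {'cookie-123'}
```

The publisher never sees `audience_log`. That is the leak: the list of its visitors now lives with someone else.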

So why would a publisher allow pixels to run on its site, then?
Pixels are an integral part of digital advertising, and most data leakage is benign, according to Paul Dolan, evp at Xaxis, a programmatic ad-buying firm. Marketers use pixels for frequency capping, limiting the number of times a person sees a given ad, for instance. But as always, there are bad actors.
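
Frequency capping, the benign use mentioned above, can be sketched in a few lines. This is a simplified illustration that assumes an in-memory counter keyed by a hypothetical cookie ID and ad ID. Real ad servers also reset counts over a time window, which is omitted here.

```python
from collections import Counter


class FrequencyCap:
    """Limit how many times one visitor is shown a given ad."""

    def __init__(self, max_impressions: int):
        self.max_impressions = max_impressions
        self.seen = Counter()  # (cookie_id, ad_id) -> impressions served so far

    def should_serve(self, cookie_id: str, ad_id: str) -> bool:
        key = (cookie_id, ad_id)
        if self.seen[key] >= self.max_impressions:
            return False  # cap reached: skip this ad for this visitor
        self.seen[key] += 1
        return True


cap = FrequencyCap(max_impressions=3)
results = [cap.should_serve("cookie-123", "watch-ad") for _ in range(5)]
print(results)  # [True, True, True, False, False]
```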

What’s the big deal?
Publishers are concerned data leakage will cause their ad inventory to depreciate.

How?
Here’s a hypothetical: GQ has a large number of affluent, fashion-conscious male readers, making its ad inventory valuable to luxury watch brand Tag Heuer. Tag Heuer runs an ad campaign on GQ at a $5 CPM and uses tracking pixels to identify the people who have seen the ad. Tag Heuer can then target those users on sites that aren’t GQ. Instead of having to pay a $5 CPM for ads on GQ, it serves ads to those same users at $1 CPM on an ad exchange.
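
The arithmetic behind that hypothetical is simple, since CPM is the cost per thousand impressions. The sketch below assumes a campaign of 1 million impressions, a figure chosen purely for illustration. Only the $5 and $1 CPMs come from the example above.

```python
def campaign_cost(impressions: int, cpm_dollars: float) -> float:
    """CPM is the price per 1,000 impressions."""
    return impressions / 1000 * cpm_dollars


impressions = 1_000_000  # assumed campaign size, not from the article

direct_cost = campaign_cost(impressions, 5.0)    # bought directly from the publisher
exchange_cost = campaign_cost(impressions, 1.0)  # same users reached via an exchange

print(direct_cost)                  # 5000.0
print(exchange_cost)                # 1000.0
print(direct_cost - exchange_cost)  # 4000.0 saved by the advertiser
```

In this scenario the publisher gives up the full $5,000 it would have earned from the direct sale, because the exchange impressions run on other sites.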

What’s so important about data anyway?
Ad networks and exchanges have allowed advertisers to buy audiences regardless of where they appear instead of having to negotiate directly with individual publishers (and pay them a premium). This process is data-driven, and the richer the data set, the more valuable it is for an advertiser.

So the more data, the merrier, right?
It depends. The fear among publishers is that advertisers, agencies and ad tech providers are going to pirate their audience data, effectively negating the value of their ad inventory. “Buying is not about media now; it’s about data,” according to Oleg Korenfeld, svp of advertising technology and platforms at media agency Mediavest.

Is it only publishers that are affected?
No. Although data leakage is frequently cited in relation to publishers, it can negatively affect a brand, as well.

Dolan offered the following hypothetical: A group of Internet users shop for mobile phone plans on AT&T’s website. AT&T identifies those users as being in the market for a new plan and uses an ad exchange to retarget them with ads. Once this occurs, however, the exchange could sell that audience data to Verizon, allowing Verizon to co-opt AT&T’s customers by using data that originated with AT&T.

Yikes. Are there any potential solutions?
Dolan said one of the best ways to prevent data leakage is for advertisers, agencies, ad tech companies and publishers to have legal contracts that stipulate who owns what data and how other parties can use it.

Is that far-fetched?
Oscar Garza, director of programmatic at digital agency Essence, said he and his agency would like to see publishers start licensing their data to marketers. It would open a new revenue stream for them while also allowing marketers to better target and track campaigns. “Data leakage is a pejorative right now, but it’s actually an opportunity for the industry to adopt more democratic uses for publisher data off of those publishers’ networks.”

What about “private” exchanges?
Some publishers have responded to data leakage concerns by establishing private marketplaces, which allow them to sell ad inventory programmatically but only to a select group of advertisers. In those instances, only the publisher and advertiser — and not the ad exchange — are able to see the audience data associated with the transaction. Private marketplaces are imperfect solutions, however — the advertiser could still co-opt the publisher’s audience data, according to Tom Chavez, CEO at Krux, a data-management platform.

So what’s Google have to do with it?
In October, Google announced it would start enforcing a pre-existing policy that bars data-management platforms such as Krux from running pixels on Google’s ad network. Google said the decision was meant to prevent data leakage on its network, but some brand and agency execs called that a red herring. They said the real aim was to push marketers toward Google’s own suite of ad tech products, and that Google was developing a data product of its own.
