The Big Data Accuracy Myth

Amit Avner is the CEO of Taykey, a media technology company that identifies target audiences based on their real-time interests.

The adulatory hoopla over big data and real-time bidding, as exemplified in “Bye Bye, Traditional Media Buying,” is premised on many highly debatable notions, most notably that “big data” will eliminate waste in advertising.

It has been noted repeatedly that the foundational units of “big data,” the cookie and the look-alike model, are often extraordinarily inaccurate. According to the anonymous ad tech executive who confessed in Digiday this spring, “We’ve seen agencies run tests against the validity of cookies on a data exchange. The gender is wrong 30-35 percent of the time.” Picking a gender at random would be wrong 50 percent of the time, so this is an improvement, but hardly an eliminator of waste.

The problem only gets worse when you add filters to get a more specific audience, because every additional attribute is another chance to be wrong. Some of this is for obvious reasons (people share computers, so you’ll never know at any given moment whether it’s my girlfriend or me that your algorithm has bought), but some of it is just the nature of the platform. Quantcast published a white paper that noted that “the half-life of an average third-party cookie … is approximately three days, and cookies for one third of online users last for less than an hour.” Finally, there is a fundamental flaw with the idea of using historical browsing data to predict future interests and behavior. As Jeff Hawkins said in a recent New York Times piece, “It only makes sense to look at old data if you think the world doesn’t change.” Many of the look-alike models that RTB platforms rely on to target multidimensional audience profiles are backward-looking, expensive to update and rarely validated.
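To put rough numbers on the two effects described above, here is a minimal back-of-the-envelope sketch. It rests on assumptions that are not in the sources quoted: that errors on different attributes are independent, and that cookie loss follows a simple exponential decay with the three-day half-life Quantcast cites. (Quantcast's own remark that a third of users' cookies last less than an hour suggests the real curve is steeper early on.) The function names are illustrative only.

```python
def combined_accuracy(per_attribute_accuracy):
    """Chance that every targeted attribute is right at once.

    Assumes (hypothetically) that errors on each attribute are independent,
    so the individual accuracies simply multiply.
    """
    result = 1.0
    for accuracy in per_attribute_accuracy:
        result *= accuracy
    return result


def cookie_survival(days, half_life_days=3.0):
    """Fraction of cookies still alive after `days`.

    Assumes plain exponential decay with the given half-life; this is a
    simplification, not Quantcast's model.
    """
    return 0.5 ** (days / half_life_days)


# Three filters, each as accurate as the best-case gender figure (70 percent).
print(round(combined_accuracy([0.7, 0.7, 0.7]), 3))

# Share of cookies left one week after they were dropped.
print(round(cookie_survival(7), 3))
```

Under these assumptions, stacking three filters that are each right 70 percent of the time leaves only about a third of the impressions hitting the intended profile, and roughly a fifth of cookies survive a week.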

It is not generally in either a data provider’s or an agency’s interest to call the data into question. But if agencies were to run a small quantity of their media in the form of surveys to validate targeting and hold their data providers financially accountable to a minimum level of accuracy (call it MLA so we can have another acronym), one of two things would happen. Most likely, the data providers would balk and thereby reveal their own level of confidence in their data. But the better outcome would be an improvement in the methodologies used to identify audiences, one that would make the data (not just the bidding) more real-time. The result might be a smaller audience pool that could be bought with more confidence.
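The survey check described above could be scored along these lines. This is only a sketch under stated assumptions: the MLA threshold, the sample sizes and the function names are hypothetical, and the Wilson score lower bound is one reasonable statistical choice for judging a small sample conservatively, not something any agency or data provider is known to use.

```python
import math


def wilson_lower_bound(confirmed, surveyed, z=1.96):
    """Lower end of the Wilson score interval for the true accuracy.

    confirmed: survey respondents who actually matched the targeted profile
    surveyed:  total respondents
    z:         1.96 corresponds to roughly 95 percent confidence
    """
    if surveyed <= 0:
        raise ValueError("surveyed must be positive")
    p = confirmed / surveyed
    denominator = 1 + z * z / surveyed
    center = (p + z * z / (2 * surveyed)) / denominator
    margin = (
        z * math.sqrt(p * (1 - p) / surveyed + z * z / (4 * surveyed ** 2))
        / denominator
    )
    return center - margin


def meets_mla(confirmed, surveyed, mla):
    """True only if even the conservative estimate clears the agreed MLA."""
    return wilson_lower_bound(confirmed, surveyed) >= mla


# Hypothetical contract: the provider guarantees at least 65 percent accuracy.
print(meets_mla(70, 100, mla=0.65))    # small survey, 70 percent observed
print(meets_mla(700, 1000, mla=0.65))  # larger survey, same observed rate
```

The design choice here is to compare the lower bound, rather than the raw observed rate, against the MLA, so a lucky result on a handful of survey responses cannot satisfy the guarantee on its own.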

The other solution is a blunt instrument that puts a lot of companies out of business but will absolutely improve data accuracy: the Facebook ad network, where all targeting is based on declared data and there are few look-alike models, only (anonymous) individuals who can be bought on the basis of granular knowledge rather than inference. Even Facebook will need to be more agile (that you liked Nokia two years ago doesn’t explain the Galaxy in your pocket today), but if it drops an identifier with every login, it solves the computer-sharing problem, the cookie-deletion problem, the mobile-targeting problem and almost all other big data problems in one fell swoop. Pricing may go up, but so will quality. Advertisers are used to accepting waste, but once companies figure out how best to buy Facebook targeting based on what audiences are engaged with in real time, it will be possible to reach audiences at a lower cost by following unpredictable flows of topical interest, and to virtually eliminate waste. Then we won’t be talking about Big Data; we’ll be talking about Good Data.
