In statistics and computer science, “GIGO” stands for “garbage in, garbage out.” No matter how brilliant your employees are, they won’t be able to turn low-quality data into accurate models. If you start with inaccurate data, your results will be just as inaccurate.
Poor data-collection practices today could lead the industry to turn its back on the use of data. Some say buying data “just isn’t all it’s cracked up to be.” One agency rep recently said to me, “If buying data alone costs 50-cents CPM and a run of site placement costs the same, then the data should double the performance, and it very rarely does.” The upside in this scenario is that agencies today are analyzing media buys in a more intelligent way, in an effort to increase ROI. At the same time, data is getting a bad rap because of companies that are not employing the best data-collection methods.
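The agency rep's arithmetic can be made explicit with a minimal sketch. It assumes the data fee is added on top of the media cost and that performance is measured as results per thousand impressions; the function name `break_even_lift` is hypothetical and used only for illustration.

```python
def break_even_lift(media_cpm, data_cpm):
    """Performance multiple a data-targeted buy needs in order to match
    the cost per result of the same placement bought without data.

    Assumption: the data fee is paid in addition to the media cost.
    """
    return (media_cpm + data_cpm) / media_cpm


# The figures from the quote: 50 cents CPM for media, 50 cents CPM for data.
print(break_even_lift(0.50, 0.50))  # 2.0
```

Under these assumptions, data that costs as much as the placement itself has to double performance just to break even, which is the bar the rep says is very rarely cleared.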
A primary focus of companies that collect, organize and sell data is scale and efficiency. The goal is to obtain as much data as quickly and easily as possible, in a privacy-safe manner, in order to maximize revenue. While scale sometimes comes naturally from data sources, other times companies rely on potentially inaccurate methods to infer behavioral or profile-based information. In addition to using inaccurate methods to achieve scale, companies employ careless methods to achieve efficiency. Common, overly simplistic methods of collecting declared and demonstrated data include extrapolating assumptions from URLs or metatags.
URL data-collection methods often result in broad assumptions based on terms included in the URL. For example, if a user visits the travel section of a news site, the URL would tell the data collector that the user is interested in travel and news, but it wouldn’t reveal that the pictures the user was looking at are of the 2012 Olympic stadium in London. That information would help further define that person’s interest in sports and, in particular, Olympic sports – something a lot of advertisers are looking to target in and around the Olympic season.
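To illustrate how coarse URL-only inference can be, here is a minimal sketch. The keyword map, the function name `interests_from_url` and the example URL are all hypothetical; this is not any vendor's actual method, only a simplified version of the approach described above.

```python
from urllib.parse import urlparse

# Hypothetical keyword-to-interest map; a real taxonomy would be far larger.
INTEREST_KEYWORDS = {
    "travel": "Travel",
    "news": "News",
    "sports": "Sports",
    "olympics": "Olympic sports",
}


def interests_from_url(url):
    """Infer interest labels using only tokens in the URL's host and path."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc}/{parsed.path}".lower()
    # Treat dots, hyphens and underscores as separators, like slashes.
    for sep in ".-_":
        raw = raw.replace(sep, "/")
    tokens = {t for t in raw.split("/") if t}
    return sorted(INTEREST_KEYWORDS[t] for t in tokens if t in INTEREST_KEYWORDS)


# Hypothetical page: a gallery of Olympic stadium photos in the travel
# section of a news site. Nothing in the URL mentions sports.
url = "https://www.example-news.com/travel/london/photo-gallery-1234"
print(interests_from_url(url))  # ['News', 'Travel']
```

The URL yields only "News" and "Travel"; the sports interest that the page content would reveal never enters the profile.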
Metatag data-collection methods are heavily reliant on the person or team that built the website. In some cases, the website builder pastes the same 200 meta keywords into the source code to describe every page on the site, which eliminates any unique data about a visitor’s behavioral habits. Metatag description words could be unrelated to the content on the page altogether. For example, metatags could say recent news, and the content could be about travel. Metatag methods can also produce extremely broad results or only include the title of the webpage, which barely scratches the surface of a user’s declared and demonstrated data.
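One simple quality check for the metatag problem is to measure how much of a site's meta-keyword vocabulary is shared by every page. The sketch below uses Python's standard `html.parser`; the helper names (`MetaKeywordParser`, `meta_keywords`, `shared_keyword_ratio`) and the two sample pages are hypothetical, and a real pipeline would need far more than this.

```python
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collects the comma-separated content of <meta name="keywords"> tags."""

    def __init__(self):
        super().__init__()
        self.keywords = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            self.keywords.update(
                k.strip().lower() for k in content.split(",") if k.strip()
            )


def meta_keywords(html):
    """Return the set of meta keywords declared in one HTML document."""
    parser = MetaKeywordParser()
    parser.feed(html)
    return parser.keywords


def shared_keyword_ratio(pages):
    """Fraction of all keywords seen that appear on every page.

    A value of 1.0 means every page carries the identical keyword set,
    so the metatags say nothing page-specific about the visitor.
    """
    keyword_sets = [meta_keywords(html) for html in pages]
    union = set().union(*keyword_sets)
    if not union:
        return 0.0
    return len(set.intersection(*keyword_sets)) / len(union)


# Two hypothetical pages with different content but identical metatags.
travel_page = (
    '<html><head><meta name="keywords" content="recent news, headlines, world">'
    "</head><body>Ten beaches to visit</body></html>"
)
sports_page = (
    '<html><head><meta name="keywords" content="recent news, headlines, world">'
    "</head><body>Stadium photo gallery</body></html>"
)
print(shared_keyword_ratio([travel_page, sports_page]))  # 1.0
```

A ratio near 1.0 across a site suggests the keywords were pasted in wholesale and should not be trusted as a signal of what any individual page is about.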
Data collection is at the core of many companies’ business models, including data management platforms, ad networks and data resellers. Employing poor, overly simplistic or inaccurate data-collection methods to attain scale and efficiency often results in negative downstream effects on performance.
The objective of behavioral data collection is to figure out who the consumer is, how they feel, and what they are really interested in. Accurately understanding this information will ultimately improve the consumer’s online experience and make the advertising they are exposed to more meaningful, which means better performance for the brand’s marketing dollars.
Pairing data with mathematical modeling – provided that both are high quality – produces the best results. While data plays an integral role in the advertising industry, there appears to be a lack of education and awareness around data-quality controls and standards. Companies in our industry need to put more time and effort into developing and implementing data-collection processes that combine technology with manual efforts to produce accurate, consistent, high-quality, scalable data. Quality data produces a significant increase in performance and ROI. Isn’t that what advertising’s all about?
Doug Pollack is the director of data analytics for Lotame.
https://digiday.com/?p=1690