Explainer: Dirty Data

What It Is: Dirty data is data that has been collected or processed in a way that makes it irrelevant or misleading for analysis. It can result from inaccurate data transfers to customer relationship management (CRM) platforms, manual entry errors, or outdated customer details from third-party processors.

How This Happens: Companies take a big risk when they entrust the critical data behind their actionable insights to third-party data collectors that don’t offer transparency along with their data management services. Dirty data is consistently listed as one of the top problems data miners encounter when interpreting data, according to Rexer Analytics’ annual survey of consumer analytics providers, and it permeates every aspect of a brand’s decision-making, from marketing to CRM. Companies may receive poor-quality data from third-party analytics companies, which may in turn outsource data collection to companies with less-than-stellar data management practices, or which may use data that isn’t timely or truly relevant.

Who Has Used It: Separate international studies by Experian and DemandGen estimate that more than 62 percent of companies have been affected by consumer data that is 20 to 40 percent inaccurate or incomplete, resulting in the misallocation of 10 percent or more of their marketing budgets. A recent study by Business Week reported that 40 percent of C-level executives believe their companies’ consumer-related data is flawed.

Why It Matters: Data quality is a key determinant in developing a marketing strategy that optimizes ad budgets and enables brands to connect with consumers at the right moment online. In a 2010 Forbes study of companies worth $500 million and above, more than 60 percent of those surveyed said that flawed consumer-related data cost their companies between $5 million and $20 million per year. While many audience management platforms offer real-time data from a variety of sources, it is crucial to be able to follow the data trail back to the original point of collection and verify that data collection practices are properly managed, in line with industry-wide privacy standards, and truly relevant to business goals.
