Why data mined from social media alone is garbage
Rudi Anggono is head of creative at Google in New York.
I can hear the groans already. Another article about data? Don’t worry. I’m not here to talk about ROI, KPI or any other I, which is usually how data is discussed. I want to talk about other aspects of data that may not be obvious to some people — especially data collected in the social media space. It’s a two-faced, insincere, duplicitous, lying sack of shit.
Consider declared data.
Declared data is the perfect vacation selfies you post on Instagram, the adorable baby video you upload, the numerous “likes” you give, the witty remarks you leave, the polite white lie you tell your waiter even if you hate the food. You get the idea. It’s influenced by your mood, your prejudices, your political agenda, your insecurities, which shape your carefully (or not so carefully) crafted public image. And it’s not entirely reliable because it’s mostly made of half-truths.
For example, just because I give a “like” to a cute cat video, it doesn’t mean I like cats. In fact, I hate cats. I do that to show my support for the poster. Because I know he just lost his partner and that cat is the only living thing that ties him to his loved one. This is context that the act of liking (the declared data) fails to recognize. Yet it’s crucial context. Or I may comment on an issue that I don’t particularly care about, but I do that to appear smart. I may even write in the poster’s native language even though I don’t speak it.
It’s fake. It’s insincere. Yet this is the data people declare to the world. Because it’s what humans do. We lie about a lot of things. Renowned cultural anthropologist Genevieve Bell once said we lie because we want to tell better stories, to project better versions of ourselves. It’s part of our genetic makeup as political animals to be accepted and survive. Unfortunately, these lies are being captured as data. Declared data. A lying sack of shit.
Which brings us to intent and behavior.
Typically, in dealing with most lying sacks of shit, you look for the intent. You confirm that intent through the behavior. The problem is that intent is not always obvious. You have to cast a much wider net, beyond the environment where the data is declared, so you can extrapolate and cross-reference. Search data is a good place to start because people search with intent, but it’s not always enough. You need to look at other data points to collect more intents and behaviors.
If I like the cute cat video, and at the same time I’m searching for “the loss of long-term partner,” linking those two data points is not enough to provide context. But if you also know that I’ve been watching videos on YouTube about dying patients with their pets, while also shopping for books on Amazon about coping with loss and caring for pets after the owner passes away, then these four data points will start showing you the fuller picture, the context.
These are digital behaviors, fueled by intents. The initial declared data of “liking” a cat video becomes more than just that. It’s deeper. This is where a much-improved algorithm for data-driven prediction (aka machine learning) combined with human intuition comes in handy.
The more you know.
We should then use declared data in concert with intent and behavior. True, in most cases, you’ll hear arguments for one or the other, which is a little bit like a pineapple arguing with a banana about which one is the real fruit.
But whether you’re a marketer, a political strategist, an agency planner or an intern researching a paper, you can’t trust data based only on what people tell you. It’s almost always a lie. You have to prod, extrapolate, look for the intent, play good cop, bad cop, get the full story, get the context, get the real insights. Use all the available analytical tools at your disposal. Or if not, get access to those tools. Only then can you trust this data.