Distinguishing Good Data From The Bad

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Nish Desai, senior director of technology, operations and partnerships at Xaxis.

Marketers have ever-growing streams of data and signals they can use to activate and optimize their advertising campaigns. But using this data to execute well is only half the battle.

To hit their mark, brand marketers must make their data both reliable and extensible to as many places as possible so they can pair it with partners’ first-party data, then fill any gaps with properly vetted third-party data. This means they need to ensure its proper collection, storage and upkeep.

Brands should investigate data collection methodologies to determine whether the data sets are complete, consistent and representative of the segments they are trying to reach. Properly vetting data requires human interaction, not just a technological solution.

Key considerations

What are my objectives? Before building a data set, brands need to ask themselves why they are building it. Where will it be used and for what purpose? This may seem counterintuitive – starting at what seems like the end – but understanding the reason for the data set’s creation will help ensure that the data that goes into it is correct.

What data is available and where did it come from? Understanding how the available data was collected can help determine how it will be used and how much value it will bring to the data set.

Where and how is the data stored? Data can be stored in house or by a partner. Often, centralizing the data in a data lake may be in the brand’s best interest.

Knowing how your data is stored is just as important as knowing where it is stored. To deliver the most value, the data must be current, so knowing how often it is refreshed is vital.

As privacy regulations emerge in more markets, it is essential that all data is collected and stored in a privacy-compliant manner.


What data is missing? Once brands have identified what data is available, they’ll likely find gaps that need to be filled. Ask potential partners direct questions and listen carefully to the answers. Vague and unclear definitions are a warning sign.

How is the data collected? To determine how a partner knows its first-party data is accurate, a brand may ask how phone numbers, ZIP codes (preferably plus six digits) or email addresses are collected and tested. It would be a positive sign if the information comes from users’ self-declared registration data and there are 100,000 users in the data pool who match the segment desired by the brand. The confidence score for that kind of deterministic first-party registration data is generally higher than 90%. When users proactively state who they are and what they like, and consistently use a login, confidence that the data held by a platform or publisher is sound goes up.

But brands should be cautious if the publisher or platform touts a complicated methodology used to deduce someone’s identity. For data derived by probabilistic means, the confidence level is nearly always below 85% and is often closer to 50%.

How are the data sets structured? Is the data merged with data from other parties? Understanding how data is structured after it is collected is extremely important. A brand needs to ensure that any partner data is in an apples-to-apples configuration with their own so that it can be easily merged. If other parties are involved in the sourcing of the data, the brand may need to inquire about their collection methods.
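
One way to test the apples-to-apples requirement before a merge is to compare field names and types across the two data sets. The Python sketch below is illustrative only: the `schema_mismatches` helper, the field names and the type labels are all hypothetical and are not drawn from any particular platform.

```python
def schema_mismatches(brand_schema, partner_schema):
    """Report fields that would block a clean merge.

    Both arguments map field name -> type label (hypothetical schemas).
    """
    brand_fields = set(brand_schema)
    partner_fields = set(partner_schema)
    shared = brand_fields & partner_fields
    return {
        # Fields the brand has that the partner lacks
        "missing_in_partner": sorted(brand_fields - partner_fields),
        # Fields the partner has that the brand lacks
        "missing_in_brand": sorted(partner_fields - brand_fields),
        # Fields present on both sides but stored as different types
        "type_conflicts": sorted(
            f for f in shared if brand_schema[f] != partner_schema[f]
        ),
    }


# Hypothetical example schemas
brand = {"email_hash": "str", "zip_code": "str", "last_seen": "date"}
partner = {"email_hash": "str", "zip_code": "int", "segment": "str"}
report = schema_mismatches(brand, partner)
```

A report with empty lists suggests the two sets line up structurally; any entry is a prompt to ask the partner how that field is sourced and stored.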

How is the data kept current? Any data based on interests or other attributes that can change over time should be refreshed periodically. Knowing how and how often these attributes are refreshed is key.
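
A freshness check can be as simple as flagging records whose attributes have not been refreshed within an agreed window. The sketch below assumes each record carries a `last_refreshed` date and uses a 90-day window; both the field name and the threshold are assumptions chosen for illustration, not industry standards.

```python
from datetime import date, timedelta


def stale_records(records, today, max_age_days=90):
    """Return records last refreshed more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return [r for r in records if r["last_refreshed"] < cutoff]


# Hypothetical interest-based records
records = [
    {"user_id": "a1", "interest": "travel", "last_refreshed": date(2024, 1, 10)},
    {"user_id": "b2", "interest": "autos", "last_refreshed": date(2024, 5, 20)},
]
stale = stale_records(records, today=date(2024, 6, 1))
```

Here only the record for `a1` falls outside the window, so it would be the one to refresh or drop.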

How is the data kept clean? If personally identifying information (PII) is collected, understanding how it is sanitized is essential. Is a clean room used to ensure that PII is removed? If values are being hashed or encrypted, understanding how this is done will help ensure that the brand is complying with the requisite privacy standards and industry best practices.
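
When asking how values are hashed, it helps to know what a typical approach looks like. The sketch below normalizes an email address (trimming whitespace and lowercasing) and then applies SHA-256 from Python's standard library. The `hash_email` helper and the normalization rules are assumptions for illustration; a given partner may use a different algorithm, add a salt or normalize differently, which is exactly what a brand should ask about.

```python
import hashlib


def hash_email(email):
    """Normalize an email address, then return its SHA-256 hex digest."""
    # Normalizing first means the same address hashes identically
    # regardless of stray whitespace or capitalization.
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


a = hash_email("  Jane.Doe@Example.com ")
b = hash_email("jane.doe@example.com")
```

Both calls produce the same digest, which is what lets two parties match records without exchanging the raw address. If the two sides normalize differently, the hashes will not match even when the underlying email is the same.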

Don’t forget the other major benefit of making sure data is good: It helps brands prepare for when third-party cookies are phased out.

Digital marketers preparing for that day know they have to build and test their first-party data stores to be as ready as possible. They need the best data they can get, and they need to build on it to mix, match and create segments within the social platforms, Google’s Privacy Sandbox and Ads Data Hub, and to match with publishers’ data warehouses. Marketers and their partners need to use best practices in gathering and maintaining their data to keep data stores clean, accurate and current.
