
Distinguishing Good Data From The Bad


“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Nish Desai, senior director of technology, operations and partnerships at Xaxis.

Marketers have ever-growing streams of data and signals they can use to activate and optimize their advertising campaigns. But using this data to execute well is only half the battle.

To hit their mark, brand marketers must make their data both reliable and extensible to as many places as possible so they can pair it with partners’ first-party data, then fill any gaps with properly vetted third-party data. This means they need to ensure its proper collection, storage and upkeep.

Brands should investigate data collection methodologies to determine whether the data sets are complete, consistent and representative of the segments they are trying to reach. Properly vetting data requires human judgment, not just a technological solution.

Key considerations

What are my objectives? Before building a data set, brands need to ask themselves why they are building it. Where will it be used and for what purpose? This may seem counterintuitive – starting at what seems like the end – but understanding the reason for the data set’s creation will help ensure that the data that goes into it is correct.

What data is available and where did it come from? Understanding how the available data was collected can help determine how it will be used and how much value it will bring to the data set.

Where and how is the data stored? Data can be stored in house or by a partner. Often, centralizing the data in a data lake may be in the brand’s best interest.

Knowing how your data is stored is as important as knowing where it is stored. To deliver the most value, the data must be current, so knowing how often it is refreshed is vital.

As privacy regulations emerge in more markets, it is essential that all data is collected and stored in a privacy-compliant manner.

What data is missing? Once brands have identified what data is available, they’ll likely find gaps that need to be filled. Ask potential partners direct questions and listen carefully to the answers. Vague and unclear definitions are a warning sign.

How is the data collected? To determine how a partner knows its first-party data is accurate, a brand may ask how phone numbers, ZIP codes (preferably plus six digits) or email addresses are collected and tested. It is a positive sign if the information comes from users’ self-declared registration data and there are 100,000 users in the data pool who match the segment desired by the brand. The confidence score for that kind of deterministic first-party registration data is generally higher than 90%. When users proactively state who they are and what they like, and consistently use a login, confidence that the data held by a platform or publisher is sound goes up.

But brands should be cautious if the publisher or platform touts a complicated methodology used to deduce someone’s identity. For data derived by probabilistic means, the confidence level is nearly always below 85% and is often closer to 50%.

How are the data sets structured? Is the data merged with data from other parties? Understanding how data is structured after it is collected is extremely important. A brand needs to ensure that any partner data is in an apples-to-apples configuration with its own so that the two can be easily merged. If other parties are involved in sourcing the data, the brand may need to ask about their collection methods as well.
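
One way to make the apples-to-apples check concrete is to compare field layouts before attempting a merge. The sketch below is illustrative only: the field names, the type labels and the `schema_mismatches` helper are all assumptions, not a description of any particular platform's format.

```python
# Hypothetical field layouts. Real ones would come from the brand's own
# data store and from the partner's documentation.
brand_schema = {"email_sha256": "str", "zip_code": "str", "last_seen": "date"}
partner_schema = {"email_sha256": "str", "zip_code": "int", "segment": "str"}


def schema_mismatches(ours, theirs):
    """Return fields that would block a clean merge.

    A field is flagged if it is missing on one side or if the two
    sides declare different types for it.
    """
    issues = {}
    for field in sorted(set(ours) | set(theirs)):
        if field not in ours:
            issues[field] = "missing from brand data"
        elif field not in theirs:
            issues[field] = "missing from partner data"
        elif ours[field] != theirs[field]:
            issues[field] = f"type differs: {ours[field]} vs {theirs[field]}"
    return issues


mismatches = schema_mismatches(brand_schema, partner_schema)
for field, problem in mismatches.items():
    print(f"{field}: {problem}")
```

Any field the check flags is a question to take back to the partner before the data sets are combined.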

How is the data kept current? Any data based on interests or other attributes that can change over time should be refreshed periodically. Knowing how and how often these attributes are refreshed is key.
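
A refresh policy can be spot-checked with a few lines of code. This is a minimal sketch under stated assumptions: the `last_refreshed` field, the sample records and the 90-day threshold are all hypothetical, and the right window depends on how quickly the attribute in question changes.

```python
from datetime import date, timedelta


def stale_records(records, today, max_age_days=90):
    """Return records whose last refresh is older than max_age_days.

    max_age_days=90 is an illustrative default, not an industry rule.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [r for r in records if r["last_refreshed"] < cutoff]


# Hypothetical sample data.
sample = [
    {"user": "a", "interest": "running", "last_refreshed": date(2024, 1, 10)},
    {"user": "b", "interest": "cooking", "last_refreshed": date(2024, 5, 20)},
]
stale = stale_records(sample, today=date(2024, 6, 1))
print([r["user"] for r in stale])
```

A large share of stale records in a partner's sample is a prompt to ask how, and how often, those attributes are actually refreshed.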

How is the data kept clean? If personally identifying information (PII) is collected, understanding how it is sanitized is essential. Is a clean room used to ensure that PII is removed? If values are being hashed or encrypted, understanding how this is done will help ensure that the brand is complying with the requisite privacy standards and industry best practices.
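
To illustrate what "understanding how this is done" can involve, here is a minimal sketch of hashing an email address with SHA-256 from Python's standard library. The normalization steps (trimming whitespace and lowercasing) and the `hash_email` helper are assumptions for the example; a given partner or platform may specify different rules.

```python
import hashlib


def hash_email(email: str) -> str:
    """Normalize an email address, then return its SHA-256 hex digest.

    Normalization here is an assumed convention: strip surrounding
    whitespace and lowercase the whole address.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# The same address typed two different ways yields the same hash.
same = hash_email("  Jane.Doe@Example.com ") == hash_email("jane.doe@example.com")
print(same)
```

The detail matters because both parties must apply identical normalization and the same hash function, or records that belong to the same person will fail to match.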

Don’t forget the other huge added benefit to making sure data is good: It helps brands better prepare for when third-party cookies are phased out.

Digital marketers preparing for that day know they have to build and test their first-party data stores to be as ready as possible. They need to have the best data they can and build on that to mix, match and build segments within the social platforms, Google’s Privacy Sandbox and Ads Data Hub, and to match with publishers’ data warehouses. Marketers and their partners need to use best practices in gathering and maintaining their data to keep data stores clean, accurate and current.

