The Future Of Marketing Data: Accuracy Trumps Everything

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Auren Hoffman, CEO at SafeGraph.

People say “data is the new oil.” But unlike oil, the quality of data is extremely difficult to measure.

In my experience, most marketing data is between 10% and 20% accurate. Data that is less than 10% accurate generally doesn’t perform. Data that is better than 20% accurate has a shrinking audience to sell. For every percentage point of accuracy above 20%, the revenue for the marketing data seller declines by about 1%.
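Taken literally, that tradeoff can be sketched in a few lines. The hard 10% floor and the linear 1%-per-point decline below are illustrative assumptions lifted straight from the figures above, not measured industry numbers:

```python
# Toy model of the accuracy/revenue tradeoff described above.
# Assumptions: below ~10% accuracy the data doesn't perform at all,
# and each accuracy point above 20% costs the seller about 1% of revenue.

def relative_revenue(accuracy_pct: float) -> float:
    """Revenue relative to a 20%-accurate segment (1.0 = baseline)."""
    if accuracy_pct < 10:
        return 0.0  # data this bad generally doesn't perform
    extra_points = max(0.0, accuracy_pct - 20)
    return max(0.0, 1.0 - 0.01 * extra_points)

for acc in (5, 15, 20, 50, 90):
    print(f"{acc}% accurate -> {relative_revenue(acc):.2f}x baseline revenue")
```

Under this toy model, a seller sitting anywhere between 10% and 20% accuracy earns the same revenue, which is exactly the incentive problem the column describes.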

So marketing data companies have had a very strong economic incentive to keep their data accuracy between 10% and 20%. The exception to this is a system like Google Search. Buying ads against people who searched for specific keywords is incredibly accurate, and Google makes up for the fact that it has a small audience for each keyword by holding an auction for each impression. Because the data is scarce and temporal, Google can command a premium for it.

Almost all marketing data is less than 20% accurate. Even gender is usually only 75% accurate on most platforms. Accuracy drops dramatically as you move to other standard demos, such as age, income, presence of children or marital status. Even very temporal data sets, like auto intenders, tend to be of very poor quality.

This is true with internet data, mobile data and even old-school mailing list data. Pretty much all marketing data is less than 20% accurate.

The historical reason that data never got better than 20% accurate is that it worked. Let’s say you bought a mailing list of “motorcycle enthusiasts.” Even if only 15% of the people on that list really loved motorcycles, your direct mail campaign might still perform very well.

For instance, if you buy location-based ads targeting people who have recently been to Burger King, 80% to 90% of the people you reach will actually have been driving past Burger King, shopping at the store across the street or simply standing 200 meters away. The segment has historically still performed because 15% accuracy is a lot better than throwing darts.
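As a concrete sketch of why those location segments get polluted: whether a GPS ping counts as a “visit” comes down to a distance threshold, and a loose threshold sweeps in everyone nearby. The store coordinates, the ping locations and the 30-meter radius below are all hypothetical:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

store = (40.7410, -73.9897)      # hypothetical Burger King location
pings = [
    (40.74102, -73.98968),       # a few meters away: plausibly inside the store
    (40.7422, -73.9881),         # ~190 m away: driving past / across the street
]

RADIUS_M = 30  # assumed geofence; widen it and the "visitor" count balloons
for lat, lon in pings:
    d = haversine_m(lat, lon, *store)
    label = "visit" if d <= RADIUS_M else "passer-by"
    print(f"{label}: {d:.1f} m from store")
```

A vendor that uses a 200-meter geofence instead of a tight one will label both pings above as Burger King visitors, which is one mechanism behind the 15%-accurate segments described in the column.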

Facebook Will Cause Data Companies To See A Reckoning

Facebook is changing everything. If you want to target men on Facebook, the targeting is probably 99% accurate for the humans who see the ads. (There is, of course, still a very big bot problem everywhere on the internet.) And while age, marital status, home location and the like are not perfect on Facebook, its targeting is an order of magnitude better than what you can get anywhere else, outside of Google Search and a few other players.

Facebook ads perform better than pretty much all other display ads because the data is better. That doesn’t mean you should move all your ad spend to Facebook, as sometimes the ads are more expensive, but it does mean Facebook is putting a lot of pressure on marketing data vendors.

Facebook is allowing more data targeting across many other news and information properties outside the Facebook walled garden. When that happens on a larger scale, we should see huge shifts in ad budgets.

Measuring Data Quality

One of the biggest problems with marketing data quality is that it is incredibly difficult to measure. It is really hard to assemble a truth set that is large enough to measure other data sets against. Ideally, you’d want a truth set with fully representative data on 1% of the population, but that is going to be really hard to come by.
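To make concrete what “measuring against a truth set” means: grading a vendor segment reduces to matching its IDs against known ground truth and counting the hits. The user IDs and labels below are invented purely for illustration:

```python
# Sketch of grading a vendor's audience segment against a truth set.
# All IDs and labels here are made up for illustration.

truth_set = {            # ground truth: user_id -> actual gender
    "u1": "male", "u2": "female", "u3": "male",
    "u4": "female", "u5": "male",
}

vendor_segment = {"u1", "u2", "u3", "u5"}   # vendor claims these users are "male"

# Grade the segment only on users we can match to the truth set.
# (A real audit must also worry about match rates and whether the
# truth set itself is representative of the population.)
matched = [uid for uid in vendor_segment if uid in truth_set]
correct = sum(1 for uid in matched if truth_set[uid] == "male")
accuracy = correct / len(matched)
print(f"segment accuracy: {accuracy:.0%}")
```

With these toy inputs the segment grades out at 75%, in line with the gender-accuracy figure cited earlier; the hard part in practice is not this arithmetic but obtaining a truth set large and unbiased enough to trust.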

Moat, Integral Ad Science, DoubleVerify, White Ops and other companies have done a great job measuring things like viewability and bot traffic. And some companies are starting to do a good job on ad effectiveness and attribution. But there is no company focused on grading and rating marketing data.

For marketing data to survive, we’ll need an independent auditor of data quality. That will need to be a company that only measures and grades – it cannot sell data, because no one will trust a company that is both the batter and the umpire. To the best of my knowledge, that company does not exist today, but I would love to invest in a great founder who is starting one.

Follow Auren Hoffman (@auren) and AdExchanger (@adexchanger) on Twitter.



  1. I think it’s more possible now outside a large first-party data aggregator like FB, which has the insight but also a conflict of interest. You’d need a stable ID, a qualified audience and scale. Good luck, founder X.

  2. Great article, but it would be even better if it contained a table showing mean data quality by attribute, along with the source for those figures.

  3. John Dempsey

    The headline is spot-on. There are far too many data solutions that claim to provide the right audience, but are watered-down to the point of being worthless. I spent my last year at Oracle Data Cloud investigating quality across hundreds of providers offered in the DMP and there is quite a range. In my experience, the best ones can be much higher than 20% accurate, but “accuracy” depends on what KPI you’re aiming for. My advice is to never select a data provider without first understanding how these audiences are constructed and what underlying truth sets are used. Don’t expect an independent data auditor anytime soon – if you’ve got a great truth set, you’ll be selling data too. That’s where the money is. 🙂

  4. warren storey

    Great article, very well thought out and written.

    I agree that data quality is an issue, but I would question your 20% accuracy stat as a de facto truth for all data vendors. I have done a lot of work in this space and know for sure that a lot of the “traditional” data sources/compilers/originators have accuracy rates above 75% when bounced against the best “truth files” that exist (and I 100% agree that there is a dearth of really good, sizable truth files). While Facebook ads perform great – as they do – performance is only relative to the ROI it provides the advertiser, so looking at performance in the absence of cost is not really a great argument. (I can get a great response rate by sending someone a very expensive premium/incentive – that does not mean it will pay out.)

    The dream of building a 1% truth set is a good one. I’m just not sure of the financials: although it is “important,” it would represent “non-working dollars” for advertisers (which are always hard to come by).


  5. Lots of numbers with no proof points, and accolades for walled gardens without proof of performance. He sold LiveRamp to a company that measures its own marketing data against truth sets, which seems to work well for them. Let me guess what his next venture will be.