Relief Is Coming For CPG Marketers Stuck On Data Desert Island

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Ramez Karkar, director of data architecture at Mediavest Spark.

The current data landscape in which CPG marketers find themselves resembles an island without access to clean water. Luckily, supplies may be within their grasp, and the situation is finally starting to change for the better.

Akin to finding fresh water on a desert island, clean data is great when you have it because the value of customer information is huge in today’s advertising world. Customer data allows marketers to garner attribution insights, build lookalike models for prospecting and cross-sell current customers. Most financial, tech, retail and travel brands have reaped the biggest benefits of precision marketing from their access to swarms of first-party data.

Yet most CPG brands have been left behind to target third-party CPG audiences modeled off a small panel of users to establish some sense of precision marketing. It is not that CPG brands don’t desire greater precision; they just don’t own the rights to the “shelf space” in the stores, and therefore have no clean data supply flowing in.

Without owning any shelf space in the stores, CPG brands are forced to accept modeled data from providers who claim that it’s clean. Modeled data might be good for mid- or upper-funnel targeting tactics, but it is not ideal for measurement, lookalike modeling or retargeting. Most CPG data providers are also not willing to share the seed set used for their models, which could be licensed, as they don’t want anyone to know the ingredients and how small their sample size actually is.

More concerning is that data providers make quality claims about their products through the attribution reporting they provide. While working with modeled CPG data is still a step in the right direction toward being more data-driven, compared to the previous tactics of broader demographic targeting, CPG brands’ reliance on data partners is unhealthy and should only be regarded as a temporary solution.

For data-starved CPG professionals, clean supplies should arrive soon. The shipments are coming in small amounts, but newer providers in the form of reward apps or online marketplaces are collecting data on the products users purchase off store shelves or online.

For example, Shopkick, Ibotta and other reward apps provide digital coupons and give cash back when users purchase certain promoted products at the grocery store. Shopkick even rewards users for simply scanning the product’s barcode – essentially giving partial rewards to users to interact with the product on the shelf – which also guarantees viewability. Ibotta links directly with some stores’ rewards programs to help streamline cashback and simultaneously collect more data.

Connexity, formerly Shopzilla, collects valuable online shopping data from Bizrate, PriceGrabber and other sources to provide marketers with highly relevant shopper data and insights at scale.

And let’s not forget Amazon. While many brands may have mixed emotions about fueling Amazon’s online shelf sales, the power of its shopper data is immense. It has pure data on who real customers are at great scale, which can be used for retargeting, lookalike modeling and insights within its programmatic platform – either on or off Amazon properties.

Additionally, there are other companies in this space, including Instacart and MiniBar, that do the in-store shopping for customers; they should also be coming to CPG’s rescue soon.

This should be the year that first-party data finally starts to trickle down from the “stores” to the “shelves” in large quantities as data collection tactics become increasingly sophisticated and vendors capitalize on this large gap in data equality. Everyone can and should have data access.

CPG marketers need to be more mindful of the data they are putting into their data management platforms and media plans and push the industry to get as much access to clean, unmodeled data as possible.

Follow Mediavest Spark (@mvsparkww) and AdExchanger (@adexchanger) on Twitter.



  1. Kevin

    This article seems to lack information about the current marketing landscape for CPG, as they have some of the most robust and granular datasets available to them, yes, even in the 3rd-party space. Datalogix (now Oracle), Nielsen Catalina, Walmart and 84.51 all leverage broad UPC-level transaction data sets for a variety of marketing applications, not panels.

    It’s easy to fall in love with the Ibottas or Shopzillas, who do great things in a growing online/app-based sector, but don’t forget that almost 93% of spend still happens offline. CPG I believe is a higher percentage and often cater to a very specific type of persona.

    I agree it’s important to ask questions of your data provider: What % of national CPG txns do you have covered? How is it skewed by geo, demo, etc.? How many are tied to various channels (social, open web, mobile)? No dataset is perfect, but the reality is CPG has some of the best datasets available, mostly due to banners tying loyalty programs to in-store savings – thank you, King Soopers cards. I know AdExchanger isn’t the NY Times, but some credibility/vetting for guest contributors would be nice.

  2. warren

    I 100% agree, Kevin – this article does a terrible disservice and is a complete misrepresentation of the state of data in the CPG space. The fact that not a single major player in the CPG Tx data space is mentioned should be a dead giveaway that the author really knows nothing about the space (how can you talk about CPG Tx data without mentioning NCS, DLX, Oracle, or retailers directly?). There are more amazing sources of CPG Tx data today than ever before – a cursory understanding of the space should be required to write an article.

    • Ramez

      Warren/Kevin – it seems you’ve both become accustomed to drinking soda. I didn’t think I needed to mention NCS and DLX because they are implied by my article. Apologies if that wasn’t clear.

      I’m not saying there’s a ton of perfect alternatives out there that combine scale and precision, but they are definitely arriving and the industry as a whole is moving away from the modeled-out 3rd party players.

      • Kevin

        Ramez – It isn’t that you aren’t aware of the players, but that the idea you presented, that these are either a) not UPC-level and/or b) modeled, is what is incorrect. CPG has some of the best data available to them, again mostly due to widely adopted loyalty programs.

        Both leverage audiences and attribution using large (emphasis on large – not small panels) in-store UPC-level transaction databases. While there are certainly modeled product offerings, they are derived from offline data stores. Exactly the “fresh water” you portrayed as the only source of this type of info. When in reality the solution you present

  3. Ibotta will introduce the same kind of skew into a data set–its users are young, mobile friendly, etc. Almost half the population fits outside that demo and wouldn’t be represented. Given the choice between cobbling together 15 different apps’ data sets and doing individual buys, I’d rather buy an NCS audience for efficiency’s sake. Agree with Kevin, ask questions and understand the data. Models aren’t perfect, but a 150 million household database for mac ‘n cheese might not be ideal either.

  4. Nielsen. IRI. Comscore. Catalina. Nielsen Catalina Solutions. Datalogix. Transaction data at scale has become a commodity. What was once only ever known within the four walls of retail, consumer transaction data, has been sold into the market. Trickle-in app data is a nice “augmenting”.

    Worth noting that Catalina Marketing’s media network sees every transaction from every register at every Walgreens, Target, Safeway, Kroger (and more) in real time, every day.

  5. tyler pietz

    the biggest challenge for CPG/FMCG marketers, as i see it, is that most have not invested in sales channels that actually allow for an extrinsic evaluation of their audience data. furthermore, the data they do have access to is either highly fractional (and thus heavily skewed when modeled to scale), or so broad and one-dimensional that it fails to provide any incremental value over a randomly-served impression. solutions such as NCS are probably the best option for most at this point, but the lack of owned data footprint is a major liability for the CPG/FMCG vertical as consumer cohorts become less homogenous and more niche