Significant Investment Needed To Unlock The Potential Of Publisher First-Party Data

“The Sell Sider” is a column written by the sell side of the digital media community.

Today’s column is written by Paul Bannister, co-founder and executive vice president at CafeMedia.

The world is, rightfully so, consumed with the pressing issues surrounding the spread of COVID-19 and economic fallout. That doesn’t change the fact that many other large changes we were collectively focused on before the global pandemic struck are moving forward regardless.

One such major change is the deprecation of third-party cookies in Google Chrome – still slated for early 2022. There are many paths being researched: Chrome’s “privacy sandbox,” logged-in users and more.

Publishers are already sitting on one of the highest potential goldmines of data to fill the cookie void: publisher first-party data, including contextual data, registration data and on-site behavioral data.

However, as I’ve discussed on Twitter with others, publisher first-party data has been highly underused. Primarily, it’s only been available to buyers with direct relationships with publishers, through private marketplaces and direct deals. Significant investments will be required before advertisers can use publisher first-party data at scale.

Improve publisher first-party data

The first opportunity for improvement is scaling publisher first-party data. Buyers are used to easily selecting data from buying tools such as demand-side platforms (DSPs) or on the open web, and publisher first-party data would need to be just as easily accessible and scalable. Otherwise, buyers will use tools such as Facebook and Google, which do that in spades. Publisher first-party data across a handful of publishers isn’t useful to many buyers, and if buyers can’t get the size of segments they’re accustomed to, they’ll buy with other tools or in other channels.

On the flip side, in a world without third-party cookies or cross-site tracking, the data would only be usable on the domains of the publishers that classify it, so that users aren’t tracked across the web and publishers’ data isn’t leaked.

The second key area of development for publisher first-party data is connected to the first: a unified taxonomy. To make segments scalable, all publisher data must be accessible via the same taxonomy, and that taxonomy needs to be both extremely detailed and very easy to update. The IAB’s content taxonomy and Data Label initiatives are a good start, but they need significant expansion (both topically and to allow for non-content data, since the IAB’s current taxonomy only covers context) and clear controls to allow for near real-time updates.

As an example, there are 13 subcategories in the IAB’s current taxonomy for “food and drink.” Facebook has hundreds of food-related categories, probably thousands. And since categories change quickly over time – coronavirus wouldn’t have been a category just a few months ago – the taxonomy needs to be easily updated and agreed upon by different parties.

There also needs to be a standardized process for how data is classified across publishers. A single publisher may classify a certain data point in one part of the taxonomy, whereas another would classify it elsewhere. That lack of consistency will make publisher first-party data far less appealing to buyers who want to know that when they’re buying a given segment, they’re getting consistent and clear data. There need to be rules to ensure that the data has the same meaning no matter where it is used.

The next generation of publisher first-party data

Trust is a huge component of using publisher first-party data at scale, and it has to hold from three perspectives. First, buyers must be able to trust that the data is valid and truthful; otherwise it isn’t useful. Second, publishers must be able to trust that their data isn’t being leaked to hundreds of companies that can use it however they want; the data must be highly secured. And most importantly, users must be able to trust that they can easily opt out of data-sharing and that the data is not used for any purposes other than what they have agreed to with a given publisher.

These challenges can be addressed but will require significant coordination across the industry and investment in standards, governance and technology. Standards would ensure that the taxonomy and processes are consistent across publishers and buying platforms. Governance is critical; all parties must follow a code of conduct to ensure that everyone can be trusted, with clear enforcement and ejection from the system for those who misbehave. And finally, we must invest in technology that can manage these systems and processes, and enforce the security and privacy required.

Interestingly, ad exchanges are one of the most logical places for many of these systems to exist. Their businesses scale across thousands of publishers, they have the technical skills to implement complex systems, and their interests are broadly aligned with the interests of publishers. Existing data and contextual targeting platforms could also play a part in a potential publisher first-party data ecosystem.

If we don’t collectively try to solve these issues, targeting on the open web will be less useful than in the walled gardens and the open web will continue to lose ground to the platforms. Publisher first-party data can be a highly valuable part of the programmatic toolset in the future. But it requires collaboration to get us to that outcome.

I’m in. Are you?

Follow Paul Bannister (@pbannist), CafeMedia (@CafeMedia_) and AdExchanger (@adexchanger) on Twitter.


1 Comment

  1. Andrew Kraft

    Great article, Paul. I agree with you – all those elements are critical. The key is to push an active integration and /communication/ of that integration across Rearc, OpenRTB, and TCF initiatives – if not more – so that we all have comfort that they aren’t just talking together, that they aren’t just working in parallel… but that they’re actually getting things done. Work by other groups (’s efforts on identity and 1st party data) as well as what great vendors are doing in the space… they need to all come together so we can solve the problem /together/ rather than in pieces.