It’s Not About Data Leakage

“The Sell-Sider” is a column written by the sell-side of the digital media community.

Rajeev Goel is CEO of PubMatic, a publisher yield optimization company.

“Data leakage” sounds nasty.

The term has been used to describe what happens when 3rd parties drop pixels on publishers’ websites and gain valuable knowledge about the publisher’s audience, often without the publisher knowing. But “leakage” is probably not the best way to describe this phenomenon. The word implies that data is somehow just slipping away from the publisher. That isn’t what is happening.

A more accurate depiction is that 3rd party companies are plucking bits of the publisher’s data while the publisher’s back is turned. Every publisher has a garden of data that they have cultivated, but many don’t know how to harvest it or even know the value of all of the different data types in their garden. A few very smart companies realized the value of the publisher’s data garden and started inviting themselves over to pluck some data here and there to add to the media they’re selling independently of the publisher.

That method turned out to be quite profitable for those demand-side experts, so the practice grew and similar companies proliferated. Now many publishers are turning around and seeing that their garden has a lot of footprints from other companies in it, and they’re not happy about it.

The natural reaction for many publishers is to build a fence around that garden to stop everyone else from touching it. But before publishers jump to conclusions, I’d suggest learning more about the pros and cons of having 3rd parties leverage your data. The issue shouldn’t be centered on stopping “leakage”; it should be about publishers gaining greater insight and greater control.

In order for publishers to understand what is happening with their data and gain greater control of it, they need 3 things:

1. Transparency Into All 3rd Parties Collecting Data

Those companies that became experts in selling audience-based impressions faster than the publishers did have actually helped to increase the value of the publisher’s non-guaranteed inventory in ways the publishers have been unable to manage themselves. Advertisers wanted to buy audience-based impressions and publishers couldn’t meet that demand, so these demand-side experts found an opportunity to satisfy advertiser needs.

A need for those demand-side experts will continue to exist, but the publisher absolutely needs the ability to see what data is being collected from them, and by what companies. Or going back to the garden analogy, those 3rd party companies should come through the front door rather than sneaking into the garden when the publisher’s back is turned.

2. The Ability to Evaluate The Value Trade-off From 3rd Parties Collecting Data

Once publishers see who is in their data garden, they need the ability to understand whether that company is doing more good than harm. If Ad Network X is collecting massive amounts of data from a given publisher’s data garden and that publisher is not seeing corresponding eCPMs from the ad network, the publisher should re-evaluate the relationship. On the other hand, if Ad Network X is collecting a lot of data and the publisher can clearly see that Ad Network X is paying them significantly higher prices than companies not collecting data, the publisher may very well choose to let them keep collecting data.

In order to see the value that the company is bringing the publisher, the publisher needs to see not just WHO is collecting data, but also the PRICING that the company is paying the publisher.
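To make the trade-off concrete, here is a minimal sketch of the comparison described above. It assumes the publisher already has a per-partner report of impressions, revenue, and whether the partner collects data. The PartnerReport fields, the network names, and all of the figures are hypothetical and purely illustrative. Only the eCPM formula (revenue per thousand impressions) is standard.

```python
from dataclasses import dataclass


@dataclass
class PartnerReport:
    """Hypothetical per-partner summary; field names are illustrative."""
    name: str
    collects_data: bool  # does this partner collect audience data on the site?
    impressions: int     # impressions sold through the partner
    revenue: float       # revenue paid to the publisher, in dollars


def ecpm(report: PartnerReport) -> float:
    """Effective CPM: revenue per thousand impressions."""
    if report.impressions == 0:
        return 0.0
    return report.revenue * 1000 / report.impressions


def baseline_ecpm(reports: list[PartnerReport]) -> float:
    """Blended eCPM across partners that do NOT collect data."""
    non_collectors = [r for r in reports if not r.collects_data]
    impressions = sum(r.impressions for r in non_collectors)
    revenue = sum(r.revenue for r in non_collectors)
    return revenue * 1000 / impressions if impressions else 0.0


def partners_to_review(reports: list[PartnerReport]) -> list[str]:
    """Data-collecting partners whose eCPM does not beat the baseline."""
    base = baseline_ecpm(reports)
    return [r.name for r in reports if r.collects_data and ecpm(r) <= base]


# Made-up figures, for illustration only.
reports = [
    PartnerReport("Ad Network X", True, 1_000_000, 500.0),
    PartnerReport("Ad Network Y", True, 500_000, 1_000.0),
    PartnerReport("Ad Network Z", False, 2_000_000, 1_600.0),
]

print(partners_to_review(reports))
```

With these made-up numbers, Ad Network X collects data but pays less than the non-collecting baseline, so it is the relationship to re-evaluate, while Ad Network Y pays a clear premium. A real evaluation would also need to account for how much data each partner collects, which this sketch leaves out.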

3. Ability to Sell Audience Independently of 3rd Parties

Last but not least, the publisher holds the greatest amount of control if they are able to sell their audience against their own guaranteed inventory via their direct sales force. Having this ability will not only dramatically increase the value of the publisher’s guaranteed inventory, but it will also free publishers from reliance on 3rd parties to sell audience on their behalf.

In short, it isn’t that publishers need to stop “data leakage,” because their data isn’t leaking. What publishers need is more control over how their data is being used. And that will come as a result of having total transparency into who is collecting data, having the ability to evaluate the value trade-off of companies collecting their data, and having the ability to sell their own audience in the same way the demand-side experts have been doing so successfully.

These are exciting times in the online advertising space, and gaining control of their audience data is key for publishers looking to maximize ad revenue and drive profitability in the coming years. Any publisher strategy for maximizing the value of audience data should start with transparency and the value trade-off question.

Follow PubMatic (@pubmatic) and AdExchanger (@adexchanger) on Twitter.


  1. Garden plucking is a nice way of looking at things. I think of it more as a gold rush. Right now, there’s a bubble occurring until the publishers figure something out. Until then, some folks are going to really profit off of this data collection. Meanwhile, privacy issues loom over our industry and government interference is on the horizon. I just hope that this data gold rush doesn’t exacerbate the situation.

  2. Varoujan Bedirian

    Great analogy, Rajeev.

    When a publisher is monetizing user data, it’s not data leakage, but data sales. Even that opportunity, though, is bounded by the publisher’s privacy policy and, perhaps just as importantly, by the regulatory climate.

    Nonetheless, for those publisher data elements that have comparable segments in data exchanges, a publisher can price its own data based on open-market levels with extra consideration given to (among others):
    – Size of publisher’s data supply compared to market’s (seller power).
    – Accuracy in publisher’s data (higher ROI for buyer).
    – Incremental value of selling data already bundled with inventory (less work for buyer).

    With pricing to known 3rd parties taken care of, there remains one major check: regularly monitoring that sanctioned 3rd parties are not sneaking unwelcome 4th-party guests into the garden. And if they are, the newcomers should announce themselves, pay the entrance fee, and tread carefully around the manicured lawn.