Getting Practical – But Not Personal – With Differential Privacy

Comic: At The Privacy Diner

Being able to share information about a group of people without compromising any individual person’s privacy kinda sounds like a form of wizardry.

But it’s not. It’s just math.

I say “just” not to downplay how technical the process is, but rather to highlight that privacy-enhancing technologies (PETs) – like differential privacy, as described above – are no longer abstract. They’ve entered the mainstream.

“Differential privacy went from a theoretical concept, something you’d think about in grad school, to ‘Oh, OK, I can get a job doing this outside of just being a professor,’” said Ryan Rogers, a mathematician who’s now on his second stint at Apple.

Rogers rejoined Apple in March as a machine learning researcher with a focus on data and privacy after a little more than five years at LinkedIn as tech lead on differential privacy (DP).

His career trajectory – from an applied math and computational sciences PhD at UPenn to Apple to LinkedIn and back to Apple again – represents the broader shift that PETs have made from the classroom into practical research labs across Silicon Valley.

PET projects

Over the past eight or nine years, all the big platforms have invested in operationalizing differential privacy: Apple, Facebook, Google, Snap, LinkedIn and even TikTok.

  • Apple, for example, uses DP to draw inferences about user behavior on its devices to power its recommendations, like which emojis are trending or new popular words appearing in texts.
  • In 2020, Facebook used it to make large data sets available to researchers studying the impact that sharing misinformation has on elections.
  • Google has used it to support some of the APIs in the Chrome Privacy Sandbox and to gather data for training neural networks.

Meanwhile, LinkedIn has been using differential privacy to measure the real-time performance of posts. This method allows content creators on the platform to see what’s resonating with certain audiences but without getting access to any specific demographic information, such as who exactly saw a post.

The goal in each of these cases is to strike the often-tricky balance between privacy and utility.

“There’s been a realization in the industry that things we’ve done in the past, like aggregation, might no longer be sufficient for protecting privacy,” Rogers said.

Not that differential privacy – or any PET, for that matter – can achieve perfection. Perfect privacy is only possible if no data is shared at all, and if nothing is shared, there is no utility.

“Think of differential privacy as existing on a spectrum,” Rogers said.

In other words, there has to be a tradeoff.

Adding more statistical noise or randomness to a data set means the privacy guarantee is stronger, but the output will likely be less accurate, and vice versa. The ratio depends on your risk tolerance level, what you’re trying to achieve and the sensitivity of the data set in question.
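To make that tradeoff concrete, here is a minimal sketch of the Laplace mechanism, one standard way to add differentially private noise to a count. It is an illustration, not any platform's actual implementation. The function names, the true count of 1,000 and the epsilon values are all assumptions chosen for the example. A smaller epsilon means a stronger privacy guarantee and a noisier answer.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    # One person can change a count by at most 1, so sensitivity is 1.
    # The noise scale grows as epsilon shrinks (stronger privacy).
    return true_count + laplace_noise(sensitivity / epsilon, rng)

def mean_abs_error(epsilon, trials=20000, seed=0):
    # Average distance between the noisy answer and the true answer.
    rng = random.Random(seed)
    true_count = 1000  # hypothetical number of post views
    total = sum(abs(noisy_count(true_count, epsilon, rng) - true_count)
                for _ in range(trials))
    return total / trials

for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps}: average error ~ {mean_abs_error(eps):.2f}")
```

The average error comes out close to 1/epsilon, so dialing privacy up by a factor of ten costs roughly a factor of ten in accuracy. Production systems also need safeguards this sketch skips, such as protection against floating-point attacks on the noise sampler.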

PEDAL to the privacy

LinkedIn’s investment in differential privacy for post analytics was about being proactive rather than reactive to risk.

One logical way to protect the privacy of someone who views a post is to only share aggregated information with the post’s author, like the top job title among viewers or a company name.

But LinkedIn’s applied research team wondered whether it would be possible for a bad-acting author to combine that information and monitor real-time updates to profiles on LinkedIn as a way to identify exactly who engaged with a post.

Although LinkedIn had never seen an attack like that happen in the wild, the team, helmed by Rogers at the time, decided to dig in and find out whether a risk really existed.

And, apparently, it did. They discovered it was technically possible to identify around 9% of post viewers using a small amount of demographic information.
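The kind of attack described above is often called a differencing attack. Below is a toy sketch of the idea, not a reconstruction of LinkedIn's analysis. The profiles, names and functions are invented, and for simplicity it assumes the author sees exact viewer counts per job title and per company rather than only the top values.

```python
from collections import Counter

# Hypothetical public profiles: (name, job title, company). All invented.
PROFILES = [
    ("Ana", "Data Scientist", "Acme"),
    ("Ben", "Data Scientist", "Globex"),
    ("Cy", "Recruiter", "Acme"),
    ("Dee", "Recruiter", "Globex"),
]

def aggregate_views(viewer_names):
    """What the author sees: exact counts per title and company, no names."""
    by_name = {name: (title, company) for name, title, company in PROFILES}
    titles = Counter(by_name[n][0] for n in viewer_names)
    companies = Counter(by_name[n][1] for n in viewer_names)
    return titles, companies

def differencing_attack(before, after):
    """Compare two snapshots and list profiles consistent with the change."""
    new_titles = after[0] - before[0]      # Counter subtraction keeps positive deltas
    new_companies = after[1] - before[1]
    return [name for name, title, company in PROFILES
            if new_titles[title] > 0 and new_companies[company] > 0]

snapshot_1 = aggregate_views(["Ana", "Dee"])
snapshot_2 = aggregate_views(["Ana", "Dee", "Cy"])  # one new viewer arrives
suspects = differencing_attack(snapshot_1, snapshot_2)
print(suspects)  # ['Cy']
```

Neither snapshot names anyone, yet the change between them points to exactly one profile. Plain aggregation offers no protection against that kind of comparison.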

The upshot of this research was the development and release of a privacy tool late last year called PEDAL, which stands for Privacy-Enhanced Data Analytics Layer.

If you’re a data scientist or some other variety of math-minded brainiac, you can dive into the details here. But I’m neither, so, in short, PEDAL applies multiple differential privacy algorithms to inject noise into event-level data before it’s shared with LinkedIn’s analytics platform.

The result is that the people viewing LinkedIn posts can’t be identified – but the person posting them can still get useful analytics instantly. Balance = struck.

“With differential privacy, you can still get useful insights from data without revealing anything at the individual level,” Rogers said. “The point here is to be as practical as possible.”

🙏 Thanks for reading (wherever you happen to be doing so; this is a judgment-free zone)! As always, feel free to drop me a line at allison@adexchanger.com with any comments or feedback.
