Oracle Partners With Tapad – Because Probabilistic Vs. Deterministic Data Isn’t An And/Or Sort Of Thing

Which is better – deterministic data or probabilistic data?

It’s a trick question because the answer is “both.”

Even Google – whose first-party cross-device logged-in user base likely trumps anything anyone other than Facebook could muster deterministically – uses a combination of logged-in user data and statistical matching.

With privacy in mind, Google’s recently announced cross-device measurement solution will only use deterministic logged-in data as the truth set to jump-start a probabilistic approach. As Neal Mohan, Google’s VP of video and display advertising, described it at launch in June, “We use people who have signed in to Google accounts on various devices as seed data, and we extrapolate from there.”

Oracle Data Cloud, a behemoth in its own right, is another prime example. The data offering, which Oracle refers to as “DaaS” or “data as a service,” announced an expansion of its partnership with Tapad on Thursday to focus on the nexus between mobile and offline.

The two had been working together for roughly the last year and a half, connecting online to mobile for Oracle’s DMP customers.

Tapad’s ID Graph uses a core of deterministic truth data to train its algorithm, thereby producing more and more accurate cross-device matches over time based on behavioral data. Oracle will integrate Tapad’s data into its own ID Graph as part of what Oracle Data Cloud SVP and GM Omar Tawakol termed a “vast effort to help marketers defragment their databases.”
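To make the idea of training on a deterministic truth set concrete, here is a minimal sketch, not Tapad's or Google's actual method. It assumes a toy logistic-regression matcher and two hypothetical behavioral features per device pair (share of sessions on the same IP address, share of overlapping active hours). The seed pairs, labels and function names are invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def match_probability(features, weights, bias):
    """Estimated probability that two devices belong to the same person."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train(pairs, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights with plain stochastic gradient descent."""
    weights = [0.0] * len(pairs[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(pairs, labels):
            error = match_probability(features, weights, bias) - label
            bias -= lr * error
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights, bias

# Hypothetical truth set: device pairs confirmed (1) or ruled out (0) by logins.
# Features: (share of sessions on the same IP, share of overlapping active hours).
SEED_PAIRS = [(0.9, 0.8), (0.8, 0.7), (0.95, 0.6), (0.1, 0.2), (0.2, 0.1), (0.05, 0.3)]
SEED_LABELS = [1, 1, 1, 0, 0, 0]

weights, bias = train(SEED_PAIRS, SEED_LABELS)

# Score device pairs that have no login in common (the probabilistic step).
likely = match_probability((0.85, 0.75), weights, bias)
unlikely = match_probability((0.1, 0.15), weights, bias)
print(f"likely pair: {likely:.2f}, unlikely pair: {unlikely:.2f}")
```

The logged-in pairs act only as labels. Once the weights are fit, the model can score pairs that never shared a login, which is where the extra reach comes from.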

“The techniques the data cloud focused on initially were deterministic – and we continue to focus on deterministic – but, if you look at the volume of interactions across mobile and across devices, deterministic doesn’t give enough volume yet,” said Tawakol, who was also the co-founder and former CEO of audience data company BlueKai before it was acquired by Oracle in February 2014. “It takes years to get to a high match rate.”

Which is why Oracle takes a three-pronged approach to the thorny challenge of establishing identity.

The first piece of the puzzle is BlueKai, which sources first- and third-party data from websites and directly from brands to build up online profiles of consumers. Datalogix, which Oracle acquired in December 2014, brings the offline sales component through direct relationships with stores. Tapad, which had preexisting partnerships with both BlueKai and Datalogix before the acquisitions, helps provide the linkages between devices and data points – of which there are many to make.

“We use both probabilistic data and deterministic data, and we see pluses and minuses with both approaches,” said Tapad CEO and founder Are Traasdahl. “Think about it: You might have three or four different active email addresses, different phone numbers and a trail of multiple physical addresses you’ve lived at tied to loyalty cards you forgot to update, as well as shared devices between you and other family members.”

In other words, deterministic data isn’t accurate by default.

“Garbage in, garbage out,” Tawakol said. “Our data scientists have been hitting me over the head for a while to stop using the terms ‘probabilistic’ and ‘deterministic’ because, from their perspective, it’s all about the confidence factor. Just because people provide deterministic data doesn’t mean that it’s the 100% truth.”
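One way to picture the “confidence factor” framing is a sketch in which every link between two devices carries a score, whether it came from a login or from a model. This is not Oracle's implementation: the `Link` structure, the scores and the 0.8 threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Link:
    device_a: str
    device_b: str
    source: str        # "login" (deterministic) or "model" (probabilistic)
    confidence: float  # 0.0 to 1.0; hypothetical scores

def accepted_links(links, threshold):
    """Keep links by confidence alone, regardless of how they were derived."""
    return [link for link in links if link.confidence >= threshold]

links = [
    Link("phone-1", "laptop-1", "login", 0.97),   # recent login on a personal device
    Link("phone-1", "tablet-7", "login", 0.55),   # login on a shared family tablet
    Link("phone-1", "tv-3", "model", 0.91),       # strong behavioral match, no login
    Link("laptop-1", "phone-9", "model", 0.40),   # weak behavioral match
]

kept = accepted_links(links, threshold=0.8)
print([(link.device_b, link.source) for link in kept])
```

With these invented scores, the shared-tablet login falls below the threshold while the login-free TV match clears it, which illustrates the point that a deterministic signal is not automatically the more trustworthy one.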

And then there are the environments where logged-in user data isn’t available to make a connection.

“You don’t need to use your email address to watch TV or go into a store, so if you want a 360-degree view of your customer, there needs to be a combination of approaches,” Traasdahl said.

Nonetheless, deterministic is Tawakol’s preferred tactic, although he acknowledged the ever-present issue around scale.

Which is where probabilistic matching can come in to boost the reach. It’s a tradeoff, said Luca Paderni, VP and research director at Forrester – “scale at the detriment of accuracy.”

Because “cross-device data matching is [not] powerful enough to do just deterministic alone,” said David McIninch, CRO of performance marketing company Acquisio.

“Despite the fact that deterministic would be the ideal state, the market’s nowhere near achieving this goal,” he said. “Any data-mining being done by players who don’t own the data has an element of probabilistic data, for sure.”

Logic dictates that even Facebook, with its Atlas ad server and its soon-to-be-launched DSP, has to engage in a bit of probabilistic gymnastics to make sure that the deterministic matches it’s making are as accurate as they can be.

“The [Facebook] data is usually extremely accurate but, much like email, there isn’t going to be an exact match for all data,” McIninch said. “[Say] someone’s name is not their actual name on Facebook, but they’ve signed in with their email and they’ve correlated data through photos, tags, statuses, etc. That rounds out a probabilistic data set that has a lot of legitimacy.”

So, in light of Facebook’s and Google’s scale and what Forrester’s Paderni called their ability to “deliver pretty high levels of reach against most audiences,” how can anyone else compete?

For one, probabilistic data doesn’t reside in a walled garden as does Facebook’s stash, said Drawbridge CEO Kamakshi Sivaramakrishnan in a previous chat with AdExchanger, observing that with deterministic techniques, “the consumer identity is not owned by the marketers,” which leads to a lack of openness and data portability.

User behavior is another limitation.

“If we compare Facebook’s mobile users against mobile-only users for Q1 and Q2, their cross-device reach is shrinking,” Sivaramakrishnan said. “This increasing single-device behavior will be a problem for even the ‘800-pound deterministic gorillas’ as they try to solve for cross-device applications, such as attribution.”
