Will probabilistic data still have a place in a world populated by Facebook and Google?
Steve Glanz, CEO of probabilistic cross-device data provider Crosswise, admits that deterministic data is superior to probabilistic connections – but his answer to that question is still yes because of one major factor: the need for scale.
“Obviously, it would be best – not for us, of course, but best for marketers and advertisers – if there were 100% deterministic data solutions available to everyone – but no one has come close to building a deterministic solution with true scale, even Facebook and Google,” he said. “We’re constantly disappointed at the number of users logged in across devices. It’s relatively small, even at the big retailers. Only a small percentage of users do it.”
But that’s not how AOL CTO Seth Demsey sees the future playing out. He anticipates a near-term evolution in the market that will make the line between ad tech and martech “extraordinarily blurry.”
“No matter how high you get your probabilistic scores, the winds I’m feeling with clients, whether direct or agency, is that deterministic matching and scale is what will win here,” Demsey told AdExchanger. “We’re going to see significantly increased instances of sophisticated CRM onboarding and CRM-driven execution, and probabilistic doesn’t play there. There is no such thing as probabilistic onboarding.”
But Yosha Ulrich-Sturmat, VP of product marketing at Neustar – and a client of Crosswise – predicts a world in which deterministic and probabilistic will coexist, if not in harmony, then at least out of necessity.
“Neustar comes from a place where we have the luxury to be able to build deterministic linkages,” he said. “But there are limits to what deterministic can do, and I believe you need to augment deterministic data with probabilistic linkages in order to see success.”
That’s what forms the basis of Neustar’s relationship with Crosswise, whose client list of more than 25 DSPs, DMPs, attribution providers and analytics companies also includes The Trade Desk, Undertone, Turn, RadiumOne, Marin Software, Eyeview and Digilant.
Crosswise prefers to be behind the scenes.
“If you’re looking for an end-to-end solution that includes media and reporting, we’re not the company for you,” said Glanz.
Crosswise, which Glanz expects to break even within the year and be profitable “relatively soon,” made its probabilistic data product generally available in the US late in 2014 after first launching a controlled test phase last July in the New York market. It rolled out its UK business in May.
Since then, Crosswise’s staff of 25 – the majority of whom are based in Tel Aviv and focused on tech and data science – has been training and validating the company’s algorithms using a data set of more than 100 million deterministic pairs which it accesses through several partnerships, including a close one with LiveRamp. Crosswise also uses LiveRamp to distribute data to many of its clients.
Tapad has a similar approach with its algorithm, using deterministic data to teach its device graph to create more accurate probabilistic cross-device connections over time.
But unlike Tapad, Crosswise doesn’t sell media, and that’s by design. It’s a neutrality thing, said Glanz. Rather than courting the agencies and advertisers themselves, Crosswise does business with the DSPs those agencies and advertisers work with to execute their campaigns.
Neustar is a good example. As a provider of marketing and information services, Neustar maintains its own deterministic data set built on a foundation of around 120 million US households comprising 200 million adults, who are tied through anonymized transactional data to 180 million devices.
It’s quite a large repository, but Neustar still needs to augment its deterministic data with probabilistic linkages, said Ulrich-Sturmat.
Neustar onboards data from Crosswise and several other cross-device technology players – Ulrich-Sturmat declined to name which ones – to extend its segmentation capabilities and help build the “identity layer” that fuels the work done inside PlatformOne, Neustar’s workflow, insights and analytics solution.
“We call it the network effect of data,” said Ulrich-Sturmat.
The data that comes on board from Crosswise arrives in the form of a massive file with three columns: one for cookie ID or device ID, another denoting the cross-device connection and a third designating a confidence score. The first row, for example, might say, “desktop cookie, Android ID, 50%,” the next row, “mobile web cookie, iPad, 80%,” the row after that, “Android ID, iPhone IDFA, 90%” and so on down the line.
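As a rough illustration of that layout, the sketch below parses the three example rows into records. The comma-separated encoding, the `Link` type and the `parse_links` helper are assumptions made for illustration only; the actual format of the Crosswise file is not described beyond its three columns.

```python
import csv
import io
from typing import List, NamedTuple


class Link(NamedTuple):
    """One row of the (hypothetical) three-column file."""
    source_id: str      # cookie ID or device ID
    linked_id: str      # the device it is believed to be connected to
    confidence: float   # confidence score as a fraction between 0 and 1


def parse_links(text: str) -> List[Link]:
    """Parse rows such as 'desktop cookie, Android ID, 50%' into Link records."""
    links = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        source_id, linked_id, score = (field.strip() for field in row)
        links.append(Link(source_id, linked_id, float(score.rstrip("%")) / 100))
    return links


# The three example rows from the article, in an assumed CSV encoding.
SAMPLE = """desktop cookie, Android ID, 50%
mobile web cookie, iPad, 80%
Android ID, iPhone IDFA, 90%
"""

links = parse_links(SAMPLE)
for link in links:
    print(link)
```

In a real feed the first two fields would hold actual identifiers rather than device-type labels; the labels here simply mirror the example rows.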
But the example rows above don’t necessarily represent separate people. It’s entirely possible that the first, second and third rows correspond to different connections for the same person, someone who owns a Samsung phone, an iPad and an iPhone.
While some clients ask Crosswise to stitch these disparate connections together into a single person before they receive the file, Glanz said Neustar prefers to receive its data “in the raw,” so to speak.
And there’s a reason why. Cross-device is a two-pronged game pitting accuracy against scale. Both of those factors need to win in order for the advertiser to get value. If the first row and the third row are the same person, but the first row is 50% accurate and the third row is 90% accurate, the accuracy of the combined connection is weaker than the accuracy of the third row on its own.
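To make the trade-off concrete, here is a minimal sketch of what stitching might look like. It groups links that share an identifier and scores each group by multiplying the confidences of its links. That scoring rule is an assumption (it treats the links as independent and is deliberately pessimistic), not a description of how Crosswise or Neustar actually combine scores; the `stitch` function is hypothetical.

```python
from collections import defaultdict
from math import prod


def stitch(links):
    """Group links that share an ID into one 'person' (union-find).

    Returns a list of (device_ids, combined_confidence) tuples.
    ASSUMPTION: combined confidence is the product of the link scores,
    i.e. the links are treated as independent.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the two IDs of every link into the same group.
    for a, b, _ in links:
        parent[find(a)] = find(b)

    devices = defaultdict(set)
    scores = defaultdict(list)
    for a, b, confidence in links:
        root = find(a)
        devices[root].update((a, b))
        scores[root].append(confidence)

    return [(devices[root], prod(scores[root])) for root in devices]


links = [
    ("desktop cookie", "Android ID", 0.5),
    ("mobile web cookie", "iPad", 0.8),
    ("Android ID", "iPhone IDFA", 0.9),
]

people = stitch(links)
for ids, confidence in people:
    print(sorted(ids), round(confidence, 2))
```

Under this assumption, the 50% link and the 90% link share the Android ID and merge into one three-device person scored at 0.45, lower than the 90% link on its own. A recipient who gets the raw rows can instead keep the 90% link and decide separately whether the 50% link is worth the added reach.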
Once a campaign has been deployed, advertisers also want to track performance and optimize accordingly.
“We need to help advertisers answer the question, ‘How did I do?’” said Ulrich-Sturmat. “And to do that, we need a holistic view of devices combined with our analytics layer.”
Back to the question of accuracy: Crosswise isn’t yet verified by Nielsen, as Tapad and Drawbridge are (at 91.2% and 97.3%, respectively), but Glanz said that the company’s matches are 90%-plus accurate. Crosswise, Glanz said, also provides “matches with lower confidence scores which allow the customer to choose what works best for them.”
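A customer choosing “what works best” could be as simple as applying a confidence cutoff. The sketch below is illustrative only: the `select_links` helper and the two thresholds are assumptions, and the rows reuse the example scores quoted earlier in the article.

```python
def select_links(links, min_confidence):
    """Keep only the links whose confidence score meets the cutoff."""
    return [link for link in links if link[2] >= min_confidence]


links = [
    ("desktop cookie", "Android ID", 0.5),
    ("mobile web cookie", "iPad", 0.8),
    ("Android ID", "iPhone IDFA", 0.9),
]

strict = select_links(links, 0.9)  # favors accuracy: fewer, stronger links
loose = select_links(links, 0.5)   # favors scale: more links, some weaker

print(len(strict), "links at a 90% cutoff")
print(len(loose), "links at a 50% cutoff")
```

Raising the cutoff trades reach for accuracy, and lowering it does the reverse, which is the accuracy-versus-scale tension the article describes.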