When Evaluating Cross-Device Graph Technology, Look Beyond Match Accuracy

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Rajiv Maheshwari, cross-device technology leader at Neustar.

With consumers increasingly accessing content and shopping via multiple devices, multiscreen and cross-device identity have become critical to advertisers.

Cross-device identity offers a unified view of individual consumers as they interact with brands’ advertising across multiple devices and platforms. That unified view opens the door to cross-device marketing, multitouch attribution, closed-loop reporting, unique reach and frequency measurement and opt-out compliance, among other desirable capabilities.

Industry giants with large user bases across mobile devices and desktops, such as Apple, Google and Facebook, have a clear advantage because users identify themselves by logging in via a single ID across all platforms. These companies have created so-called “walled gardens” offering deterministic identities at scale, potentially grabbing the lion’s share of advertisers’ spending.

Other companies in the ad tech ecosystem need a formidable alternative solution to compete. Several vendors have emerged over the last few years to fill the void with probabilistic cross-device matching technology that links browser cookies and device IDs to each user.

The vendors’ aggressive marketing has largely driven the conversation toward match accuracy of clustered identities in their “device graph.” Vendors have claimed accuracy ranging from 70% to 97%. But what they are really talking about is precision, incorrectly defined as accuracy.

I’ve had to evaluate several device graph technologies over the last year. I’ve found that, in general, the level of sophistication in currently available solutions on the market is still low compared to other successful applications of machine learning technologies, such as email spam filtering, recommendation engines, face recognition or fraud detection. Here are some of the criteria I’ve learned to consider.

Precision And Recall

Precision is the percentage of clustered identities in the device graph that are truly linked to the same individual. Recall, on the other hand, is the percentage of all existing user identities that are clustered in the device graph.

For example, say a given user has five different IDs across multiple browsers and devices, which I’ll call A, B, C, D and E. If IDs A, B and F – some other user’s ID – are clustered in the device graph, the device graph’s precision is 67%, since two of the three clustered IDs are correct. However, the recall is only 40% since only two of the user’s five IDs are correctly clustered. Lower precision means more false positives, while lower recall means more false negatives.
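
The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the function name `precision_recall` is made up for this column, and it scores a single cluster against a single user's known IDs rather than a whole device graph.

```python
def precision_recall(cluster, true_ids):
    """Score one predicted cluster against one user's known IDs.

    Hypothetical helper for illustration only, not a vendor API.
    """
    cluster = set(cluster)
    true_ids = set(true_ids)
    correct = cluster & true_ids  # IDs that really belong to this user
    precision = len(correct) / len(cluster) if cluster else 0.0
    recall = len(correct) / len(true_ids) if true_ids else 0.0
    return precision, recall


true_ids = {"A", "B", "C", "D", "E"}  # the user's five real IDs
cluster = {"A", "B", "F"}             # what the device graph clustered

precision, recall = precision_recall(cluster, true_ids)
print(f"precision={precision:.0%} recall={recall:.0%}")
# prints: precision=67% recall=40%
```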

Depending upon your target use cases, you may prefer higher precision to recall or vice versa. For example, higher precision is desirable if marketers want to retarget an audience with sequential messaging. Higher recall is desirable if the goal is to increase audience reach by acquiring new screens. Some vendors may also provide the ability to adjust precision vs. recall via a cluster affinity score for IDs. Increasing recall typically also increases scale.
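
To make the trade-off concrete, here is a rough sketch of how a cluster affinity threshold might work. The scores below are invented for illustration, ID G is a second hypothetical stranger alongside F, and real vendors may expose this control quite differently.

```python
# The user's real IDs, as in the earlier example.
true_ids = {"A", "B", "C", "D", "E"}

# Hypothetical affinity scores: how strongly each candidate ID is
# believed to belong to this user's cluster. F and G are other users.
scores = {"A": 0.95, "B": 0.90, "C": 0.70, "F": 0.60,
          "D": 0.50, "G": 0.45, "E": 0.40}


def cluster_at(threshold):
    """Keep only the IDs whose affinity score meets the threshold."""
    return {id_ for id_, score in scores.items() if score >= threshold}


def precision_recall(cluster, true_ids):
    correct = cluster & true_ids
    precision = len(correct) / len(cluster) if cluster else 0.0
    recall = len(correct) / len(true_ids) if true_ids else 0.0
    return precision, recall


for threshold in (0.80, 0.55, 0.35):
    cluster = cluster_at(threshold)
    p, r = precision_recall(cluster, true_ids)
    print(f"threshold={threshold:.2f} size={len(cluster)} "
          f"precision={p:.0%} recall={r:.0%}")
```

With these made-up scores, lowering the threshold grows the cluster and lifts recall from 40% to 100%, while precision slips from 100% to about 71%.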

Partial vs. Fully Clustered Cross-Device Identities

Ideally, each cluster in the device graph should have all the IDs linked to the same individual. From our previous example, IDs A, B, C, D and E would constitute a full cluster. However, the vendor’s device graph may provide only pairwise or partial clusters with IDs spread across multiple clusters as illustrated by the following ID tuples:

  • [A, B, F]
  • [B, C]
  • [D, E]
  • [B, E]

Many of the cross-device business use cases, such as multitouch attribution, depend on accurately assembling the entire user events chain. It is a lot simpler to stitch together users’ journeys across multiscreen touch points with fully clustered cross-device identities. Assembling user events chains from partially clustered identities is computationally intensive when dealing with billions of user events. Partial identity clusters are only a partial solution to the cross-device identity problem.
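
One common way to stitch partial clusters together is to treat every tuple as a set of links and merge anything that shares an ID, using a union-find (disjoint-set) structure. The sketch below is a minimal single-machine illustration with a made-up function name, not a description of any vendor's method, and it ignores the scale problem entirely.

```python
def merge_clusters(tuples):
    """Merge partial ID tuples into full clusters with union-find.

    Hypothetical helper for illustration; any two tuples that share
    an ID end up in the same merged cluster.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for ids in tuples:
        if not ids:
            continue
        find(ids[0])  # register single-ID tuples too
        for other in ids[1:]:
            union(ids[0], other)

    groups = {}
    for id_ in list(parent):
        groups.setdefault(find(id_), set()).add(id_)
    return sorted(sorted(group) for group in groups.values())


partial = [["A", "B", "F"], ["B", "C"], ["D", "E"], ["B", "E"]]
print(merge_clusters(partial))
# prints: [['A', 'B', 'C', 'D', 'E', 'F']]
```

Note that the merged cluster also picks up F, the other user's ID, because the mistaken link in the first tuple carries through the merge. Stitching partial clusters improves recall but inherits every false positive along the way.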

Scale Of Clustered Cross-Device Identities

Vendors often tout that they have more than a billion IDs in their device graph. However, what matters from a cross-device perspective is how many of those IDs are clustered. Probabilistic and deterministic matching can only tell which IDs are linked to the same individual.

A standalone ID does not necessarily imply that the corresponding user has only one digital ID in the universe. In that respect, standalone IDs in the device graph are about as useful as IDs that are not in the device graph at all, meaning they provide no additional useful information. So although a vendor may have more IDs in its device graph, many may be useless.
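
When comparing vendors, a quick sanity check is to count how many IDs sit in clusters of two or more rather than the raw total. The toy graph and the function name below are hypothetical and exist only to illustrate the calculation.

```python
def clustered_share(clusters):
    """Return (clustered_ids, total_ids) for a list of ID clusters.

    An ID counts as clustered only if its cluster holds at least two IDs.
    """
    total = sum(len(cluster) for cluster in clusters)
    clustered = sum(len(cluster) for cluster in clusters if len(cluster) > 1)
    return clustered, total


# Made-up device graph: two real clusters and three standalone IDs.
graph = [["A", "B", "F"], ["G"], ["H"], ["I", "J"], ["K"]]

clustered, total = clustered_share(graph)
print(f"{clustered} of {total} IDs are clustered ({clustered / total:.1%})")
# prints: 5 of 8 IDs are clustered (62.5%)
```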

Individual And Household-Level Hierarchical Clustering

Finally, if your marketing goals require both individual and household-level granularity, it may be a good idea to ask your vendor if they can support hierarchical clustering of individual-level cross-device identity clusters into households. There are several algorithms that perform hierarchical clustering. Redesigning the algorithms to perform at big data scale is difficult but certainly achievable with currently available technologies.


I find it encouraging to see growing interest and momentum in cross-device identity solutions. With several readily available machine learning software libraries and tools, the entry barrier is set fairly low.

On the flip side, there are significant data science and engineering challenges to overcome. A comprehensive solution would also need access to online, mobile and offline identity data points. Hopefully, increased competition will drive innovation in this space.

Follow Neustar (@Neustar) and AdExchanger (@adexchanger) on Twitter.
