When Evaluating Cross-Device Graph Technology, Look Beyond Match Accuracy

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Rajiv Maheshwari, cross-device technology leader at Neustar.

With consumers increasingly accessing content and shopping via multiple devices, multiscreen and cross-device identity have become critical to advertisers.

Cross-device identity offers a unified view of individual consumers as they interact with brands’ advertising across multiple devices and platforms. That unified view opens the door to cross-device marketing, multitouch attribution, closed-loop reporting, unique reach and frequency measurement and opt-out compliance, among other desirable capabilities.

Industry giants with large user bases across mobile devices and desktops, such as Apple, Google and Facebook, have a clear advantage because users identify themselves by logging in via a single ID across all platforms. These companies have created so-called “walled gardens” offering deterministic identities at scale, potentially grabbing the lion’s share of advertisers’ spending.

Other companies in the ad tech ecosystem need a formidable alternative solution to compete. Several vendors have emerged over the last few years to fill the void with probabilistic cross-device matching technology that links browser cookies and device IDs to each user.

The vendors’ aggressive marketing has largely driven the conversation toward match accuracy of clustered identities in their “device graph.” Vendors have claimed accuracy ranging from 70% to 97%. But what they are really talking about is precision, incorrectly defined as accuracy.

I’ve had to evaluate several device graph technologies over the last year. I’ve found that, in general, the level of sophistication in currently available solutions on the market is still low compared to other successful applications of machine learning technologies, such as email spam filtering, recommendation engines, face recognition or fraud detection. Here are some of the criteria I’ve learned to consider.

Precision And Recall

Precision is the percentage of clustered identities in the device graph that are truly linked to the same individual. Recall, on the other hand, is the percentage of an individual’s existing IDs that the device graph correctly clusters together.

For example, say a given user has five different IDs across multiple browsers and devices, which I’ll call A, B, C, D and E. If IDs A, B and F – some other user’s ID – are clustered in the device graph, the device graph’s precision is 67%, since two of the three clustered IDs are correct. However, the recall is only 40%, since only two of the user’s five IDs are correctly clustered. Lower precision means more false positives, while lower recall means more false negatives.
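The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch for a single user and a single cluster; the function name `precision_recall` is only illustrative and is not taken from any vendor's product.

```python
def precision_recall(cluster, true_ids):
    """Return (precision, recall) for one predicted cluster against one user's true IDs."""
    cluster, true_ids = set(cluster), set(true_ids)
    correct = cluster & true_ids               # IDs that really belong to the user
    precision = len(correct) / len(cluster)    # share of the cluster that is correct
    recall = len(correct) / len(true_ids)      # share of the user's IDs that were found
    return precision, recall


true_ids = {"A", "B", "C", "D", "E"}   # the user's five IDs
cluster = {"A", "B", "F"}              # F belongs to some other user

precision, recall = precision_recall(cluster, true_ids)
print(f"precision={precision:.0%}, recall={recall:.0%}")  # precision=67%, recall=40%
```

A real evaluation would average these figures over many users with known ground truth, which is usually the hard part to obtain.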

Depending upon your target use cases, you may prefer higher precision to recall or vice versa. For example, higher precision is desirable if marketers want to retarget an audience with sequential messaging. Higher recall is desirable if the goal is to increase audience reach by acquiring new screens. Some vendors may also provide the ability to adjust precision vs. recall via a cluster affinity score for IDs. Increasing recall typically also increases scale.
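To make the trade-off concrete, here is a rough sketch of how an affinity threshold could shift precision and recall. The candidate IDs, their scores and the ground-truth flags are all made up for illustration; actual vendors may expose affinity scores quite differently.

```python
# Hypothetical candidates for one user's cluster:
# (ID, affinity score, does it truly belong to the user?)
candidates = [
    ("A", 0.95, True),
    ("B", 0.90, True),
    ("F", 0.70, False),
    ("C", 0.60, True),
    ("G", 0.40, False),
    ("D", 0.35, True),
]
TOTAL_TRUE_IDS = 5  # the user owns A through E; E never shows up as a candidate


def precision_recall_at(threshold):
    """Keep only candidates at or above the threshold, then score the resulting cluster."""
    kept = [c for c in candidates if c[1] >= threshold]
    correct = sum(1 for _, _, is_match in kept if is_match)
    precision = correct / len(kept) if kept else 0.0
    recall = correct / TOTAL_TRUE_IDS
    return precision, recall


for threshold in (0.8, 0.5, 0.3):
    p, r = precision_recall_at(threshold)
    print(f"threshold={threshold}: precision={p:.0%}, recall={r:.0%}")
```

With this toy data, a strict threshold of 0.8 gives 100% precision at 40% recall, while loosening it to 0.3 raises recall to 80% and drops precision to 67%.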

Partial vs. Fully Clustered Cross-Device Identities

Ideally, each cluster in the device graph should have all the IDs linked to the same individual. From our previous example, IDs A, B, C, D and E would constitute a full cluster. However, the vendor’s device graph may provide only pairwise or partial clusters, with IDs spread across multiple clusters, as illustrated by the following ID tuples:

  • [A, B, F]
  • [B, C]
  • [D, E]
  • [B, E]

Many of the cross-device business use cases, such as multitouch attribution, depend on accurately assembling the entire user events chain. It is a lot simpler to stitch together users’ journeys across multiscreen touch points with fully clustered cross-device identities. Assembling user events chains from partially clustered identities is computationally intensive when dealing with billions of user events. Partial identity clusters are only a partial solution to the cross-device identity problem.
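One common way to stitch partial clusters back together is to treat each tuple as a set of links and compute connected components. The sketch below uses a small union-find structure for that; it is an assumed approach for illustration, not a description of how any particular vendor or buyer does it.

```python
from collections import defaultdict


def merge_partial_clusters(tuples):
    """Merge overlapping ID tuples into full clusters (connected components)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups fast
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for ids in tuples:
        first = ids[0]
        find(first)                        # register single-ID tuples too
        for other in ids[1:]:
            union(first, other)

    groups = defaultdict(set)
    for device_id in list(parent):
        groups[find(device_id)].add(device_id)
    return sorted(sorted(group) for group in groups.values())


partial = [["A", "B", "F"], ["B", "C"], ["D", "E"], ["B", "E"]]
print(merge_partial_clusters(partial))  # [['A', 'B', 'C', 'D', 'E', 'F']]
```

Note that F, the other user's ID from the precision example, ends up in the merged cluster: a single false-positive link propagates through stitching, which is one more reason partial clusters are harder to work with.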

Scale Of Clustered Cross-Device Identities

Vendors often tout that they have more than a billion IDs in their device graph. However, what matters from a cross-device perspective is how many of those IDs are clustered. Probabilistic and deterministic matching can only tell which IDs are linked to the same individual.

A standalone ID does not necessarily imply that the corresponding user has only one digital ID in the universe. In that respect, standalone IDs in the device graph are about as important as IDs that are not in the device graph, meaning they provide no additional useful information. So although a vendor may have more IDs in its device graph, many may be useless.
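A quick way to apply this when comparing vendors is to count only the IDs that sit in clusters of two or more. The sketch below assumes the device graph can be exported as a list of ID sets; the sample graph is invented.

```python
def clustered_share(clusters):
    """Return (clustered IDs, standalone IDs, clustered share of all IDs)."""
    total = sum(len(c) for c in clusters)
    clustered = sum(len(c) for c in clusters if len(c) > 1)
    share = clustered / total if total else 0.0
    return clustered, total - clustered, share


# Toy device graph: two real clusters and three standalone IDs.
graph = [{"A", "B", "C"}, {"D"}, {"E"}, {"F", "G"}, {"H"}]

clustered, standalone, share = clustered_share(graph)
print(f"{clustered} clustered, {standalone} standalone, {share:.1%} clustered")
```

Here a graph advertised as holding eight IDs contains only five that carry any cross-device information.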

Individual And Household-Level Hierarchical Clustering

Finally, if your marketing goals require both individual and household-level granularity, it may be a good idea to ask your vendor if they can support hierarchical clustering of individual-level cross-device identity clusters into households. There are several algorithms that perform hierarchical clustering. Redesigning the algorithms to perform at big data scale is difficult but certainly achievable with currently available technologies.
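To show what a two-level hierarchy buys you, the sketch below models households that contain individuals that contain device IDs, then counts reach at both levels. It does not implement a clustering algorithm; the class names, household assignments and exposure data are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    person_id: str
    device_ids: set


@dataclass
class Household:
    household_id: str
    members: list = field(default_factory=list)

    def device_ids(self):
        """All device IDs belonging to any member of the household."""
        ids = set()
        for member in self.members:
            ids |= member.device_ids
        return ids


def reach(exposed_ids, households):
    """Count unique individuals and unique households reached by the exposed IDs."""
    people = sum(
        1 for h in households for m in h.members if m.device_ids & exposed_ids
    )
    homes = sum(1 for h in households if h.device_ids() & exposed_ids)
    return people, homes


households = [
    Household("H1", [Individual("P1", {"A", "B", "C"}), Individual("P2", {"D", "E"})]),
    Household("H2", [Individual("P3", {"G"})]),
]

# Four exposed device IDs collapse to three people in two households.
print(reach({"A", "B", "D", "G"}, households))  # (3, 2)
```

The same ad exposures yield different reach numbers at each level, which is why it is worth confirming that a vendor can deliver both.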

I find it encouraging to see growing interest and momentum in cross-device identity solutions. With several machine learning software libraries and tools readily available, the barrier to entry is fairly low.

On the flip side, there are significant data science and engineering challenges to overcome. A comprehensive solution would also need access to online, mobile and offline identity data points. Hopefully, increased competition will drive innovation in this space.

Follow Neustar (@Neustar) and AdExchanger (@adexchanger) on Twitter.
