“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.
Today’s column is written by Tim Abraham, director of data platforms at Adbrain.
Digital channels have provided a great return in dollar value and consumer insights. As cross-device solutions have emerged, measurement has become a priority, and rightfully so, but it remains a poorly understood concept.
In the context of cross-device advertising, there’s a fixation on the accuracy of probabilistic solutions compared to the login-based matches produced by the likes of Facebook. Marketers typically think of accuracy in terms of cross-device technology being able to correctly attribute cross-device behavior to a single consumer, something Facebook does well.
The rise of competing cross-device solutions has created confusion around this idea of accuracy and, at worst, the term is purposely misused. It’s about time we move past this simplified idea of probabilistic vs. deterministic cross-device solutions and focus on metrics that reveal a different story.
Accuracy is a metric that doesn’t necessarily mean what people think it does. In the context of cross-device identification, accuracy is calculated as the number of matches correctly identified plus the number of non-matches correctly identified, divided by the total number of predictions made. In other words, it counts every correct prediction, including correct “non-match” predictions, as a share of the total pool of predictions.
Marketers don’t care very much about the non-match predictions; what they want are correct device matches. But among candidate device pairs there will be far more non-matches than matches, so correct non-match predictions dominate the accuracy score, making a solution look much better than it really is.
For example, if you wanted to identify two people who are related to one another and you took a random sample of a crowd, there is very little chance any two people would be related, but there is a very high chance they will not be related. If you’re looking for matches, but all you end up with is non-matches, that knowledge isn’t very valuable because you haven’t actually found what you’re looking for.
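To put rough numbers on this, here is a minimal Python sketch. The pool size and match count are made up for illustration, and the "solution" is a deliberately useless one that predicts a non-match for every pair of devices.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all predictions (match and non-match) that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical pool: 1,000,000 candidate device pairs, of which only
# 1,000 are true cross-device matches. These figures are assumptions.
total_pairs = 1_000_000
true_matches = 1_000

# A "solution" that predicts non-match for every single pair.
tp = 0                           # matches correctly identified
fp = 0                           # predicted matches that were wrong
fn = true_matches                # real matches it missed
tn = total_pairs - true_matches  # non-matches correctly identified

always_no_accuracy = accuracy(tp, tn, fp, fn)
print(f"Accuracy: {always_no_accuracy:.1%}")  # Accuracy: 99.9%
```

Under these assumed numbers, a solution that finds no matches at all still scores 99.9% accuracy, which is why the figure on its own says little about match quality.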
Why Precision And Recall Matter
Recall and precision are relevant metrics that deserve attention. Recall is the number of matches correctly identified as a share of the total cross-device matches that truly exist. A high recall means that a solution has found a large proportion of the matches that exist, but it tells us nothing about how many of the predicted matches didn’t actually exist.
Optimizing toward recall means you get a mixed bag, so to speak, but this ultimately produces a larger pool of matches. Precision is the number of predicted matches that are actually correct as a share of the total number of matches predicted. This is what most people incorrectly think of as accuracy.
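The two definitions can be sketched in a few lines of Python. The counts are hypothetical: a pool of 1,000,000 candidate device pairs containing 1,000 true matches, and a solution that predicts 800 matches, 600 of them correct.

```python
def precision(tp, fp):
    """Correct predicted matches as a share of all predicted matches."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Correct predicted matches as a share of all matches that truly exist."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical counts, assumed for illustration only.
tp = 600      # predicted matches that are real
fp = 200      # predicted matches that are not real
fn = 400      # real matches the solution missed
tn = 998_800  # non-matches correctly left alone

p = precision(tp, fp)
r = recall(tp, fn)
acc = (tp + tn) / (tp + tn + fp + fn)

print(f"Precision: {p:.0%}")    # Precision: 75%
print(f"Recall:    {r:.0%}")    # Recall:    60%
print(f"Accuracy:  {acc:.2%}")  # Accuracy:  99.94%
```

With these assumed counts, one in four predicted matches is wrong and four in ten real matches are missed, yet accuracy still reads 99.94%.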
The Perfect Mix?
In practice, cross-device metrics for probabilistic solutions can be thought of as a pulley system where improving one metric often means trading off against the others. This is particularly true for recall and precision.
Depending on the application, precision might not be all you’re after. If the goal is to achieve scale, for example, precision can be traded for higher recall. Scale is a major selling point for media buyers, and finding a sweet spot between precision and recall will likely yield better results than a highly precise solution with inherently limited reach.
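One way to picture the pulley is to assume that a probabilistic solution assigns each candidate device pair a confidence score and declares a match above some cutoff. That mechanism, the scores and the cutoffs below are all assumptions made for illustration, not a description of any particular vendor’s method.

```python
# Hypothetical scored device pairs: (confidence score, is it truly a match?)
scored_pairs = [
    (0.95, True), (0.90, True), (0.85, False), (0.80, True),
    (0.70, True), (0.60, False), (0.55, True), (0.40, False),
    (0.30, True), (0.20, False),
]

def precision_recall_at(threshold, pairs):
    """Precision and recall when every pair scoring >= threshold is called a match."""
    tp = sum(1 for score, is_match in pairs if score >= threshold and is_match)
    fp = sum(1 for score, is_match in pairs if score >= threshold and not is_match)
    fn = sum(1 for score, is_match in pairs if score < threshold and is_match)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Lowering the cutoff admits more pairs: recall rises while precision falls.
results = {t: precision_recall_at(t, scored_pairs) for t in (0.9, 0.5, 0.2)}
for threshold, (prec, rec) in results.items():
    print(f"cutoff {threshold:.1f}: precision {prec:.0%}, recall {rec:.0%}")
# cutoff 0.9: precision 100%, recall 33%
# cutoff 0.5: precision 71%, recall 83%
# cutoff 0.2: precision 60%, recall 100%
```

In this toy data the strictest cutoff gives perfect precision but finds only a third of the real matches, while the loosest finds them all at the cost of four wrong matches in every ten. A sweet spot sits somewhere in between, depending on how much scale the campaign needs.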
The focus on accuracy is really just scratching the surface. By looking at different metrics, such as precision and recall, marketers can effectively exceed their goals. Accuracy may, in fact, play just a small part in that.