Bid Log Data: Here, There, Everywhere

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Tom Triscari, co-founder and managing partner at Labmatik.

Programmatic often feels like a Churchillian riddle, wrapped in a data mystery, inside a $100 billion enigma. Perhaps this is why log data has come in and out of fashion over the last five years, returning again in the past few months.

The hype cycle started in 2014 with Ad/Fin. It felt like Manifest Destiny that all marketers would adopt log data as a necessary programmatic tool.

Then reality set in. Five years later, adoption is stuck in a trough of disillusionment. What appeared to be a no-brainer with plenty of PR buzz never made a dent in marketers’ priority lists.

As we enter 2020, the second coming of log data has nowhere to go but up from the trough and toward mass adoption. Perhaps when programmatic spending surpasses $100 billion this year, the need for log data accountability and competitive advantage will finally reach a tipping point.

However, marketers’ key questions remain the same as in 2014: What do I gain from log data that I don’t get today? What does it cost? Is it worth it?

Log data = pricing data + impression quality data

Bid logs are the data that describe a single transaction event: They answer the who, what, where, why, when, how much and how many at the micro-level. For example, when buyers log into a DSP’s or verification vendor’s reporting console, the dashboards, tables and downloadable spreadsheets they see are aggregated from log-level data.

Standard platform reports provide a high-level view of the forest but not individual trees or leaves, nor can you adjust the lens to study mesophyll cells below the epidermis of the leaf skin or the root structure. Curious marketers interested in knowing what makes the trees and forest function must go micro to correctly interpret the macro.

From an anatomical point of view, impression-level log data comes in two classes: pricing data and quality data. The former comes from DSPs and SSPs, the latter from verification vendors. Every time an impression is bid on, won and served (the roots), a series of tags fires and collects all the data that describes the individual transaction event (cells in the leaves).

The term “log data” refers to pricing data married with ad quality data through a matching process. When pricing and quality data tie the knot, you end up knowing something about what matters most: value.

More importantly, you learn something about relative value. For instance, suppose ad quality data indicates that 50% of impressions on Publisher A are viewable. That could be good or bad, but you can only draw a conclusion by knowing the price paid and comparing the result to Publishers B, C and so on.

Even once you reach a conclusion, you still don’t know much until you run a log-level query and inspect all impressions on Publisher A. Perhaps Publisher A is pushing bad impressions your way. Or perhaps your DSP’s algo, left to its own devices, buys low- or zero-quality impressions to solve the very optimization puzzle that it was designed for: spend the budget and hit a preset margin goal.
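The relative-value comparison above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor’s actual schema: the field names (`impression_id`, `publisher`, `price`, `viewable`) and all the numbers are made up, and real matching between DSP and verification logs is messier than a clean ID join.

```python
from collections import defaultdict

# Hypothetical pricing rows, as a DSP log might supply them (price paid per impression, USD).
pricing_log = (
    [{"impression_id": f"a{i}", "publisher": "A", "price": 0.002} for i in range(4)]
    + [{"impression_id": f"b{i}", "publisher": "B", "price": 0.006} for i in range(4)]
)

# Hypothetical quality rows, as a verification vendor might supply them.
quality_log = (
    [{"impression_id": f"a{i}", "viewable": i < 2} for i in range(4)]    # 2 of 4 viewable
    + [{"impression_id": f"b{i}", "viewable": i < 3} for i in range(4)]  # 3 of 4 viewable
)


def value_by_publisher(pricing_rows, quality_rows):
    """Match pricing rows to quality rows on impression_id and summarize per publisher."""
    viewable_by_id = {row["impression_id"]: row["viewable"] for row in quality_rows}
    totals = defaultdict(lambda: {"spend": 0.0, "matched": 0, "viewable": 0})
    for row in pricing_rows:
        if row["impression_id"] not in viewable_by_id:
            continue  # unmatched impressions are simply skipped in this sketch
        t = totals[row["publisher"]]
        t["spend"] += row["price"]
        t["matched"] += 1
        t["viewable"] += int(viewable_by_id[row["impression_id"]])

    report = {}
    for publisher, t in totals.items():
        report[publisher] = {
            "viewable_rate": t["viewable"] / t["matched"],
            "cpm": 1000 * t["spend"] / t["matched"],
            # Cost per 1,000 viewable impressions; None if nothing was viewable.
            "viewable_cpm": 1000 * t["spend"] / t["viewable"] if t["viewable"] else None,
        }
    return report


for publisher, row in sorted(value_by_publisher(pricing_log, quality_log).items()):
    print(publisher, row)
```

With these invented numbers, Publisher A’s 50% viewability works out to roughly $4 per thousand viewable impressions, while Publisher B’s 75% viewability costs roughly $8. The lower viewability rate turns out to be the better value, which is the point of marrying price to quality.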

Log data is expensive. No, it’s not.

Marketers routinely hear supply chain agents say log data is “prohibitively expensive” or “costly to run and not necessarily required to identify key issues and performance opportunities.”

Given these positions, the supply chain is essentially daring marketers/procurement to decide if storing and processing log data is expensive or inexpensive and useful or useless. If it is expensive and useless, then who cares? Likewise, even if log data is super cheap but still useless, it’s a nonstarter. On the other hand, what if log data is useful beyond doubt? If so, then how a marketer interprets cost becomes a question of relativity.

Fortunately, marketers need not guess about the cost of storing and processing log data. Just call AWS or ask someone with experience dealing with it. As a rule of thumb, the cost to store, match and process bid logs is around $0.02 CPM or less. Alternatively, you can take a conservative approach and cap it at a nickel CPM, thus keeping procurement in safe hands.

Assuming an average $10 media CPM, the variable cost of turning raw data into valuable information is about 0.2% of media at $0.02 CPM and no more than 0.5% at the nickel cap. Any way you cut it, log data costs marketers next to nothing compared to standard DSP fees, managed service fees, ad serving fees, verification fees and SSP take rates. Moreover, the percentage cost will decrease as CPMs for quality impressions likely increase over time.
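The arithmetic is easy to check. The sketch below uses only the figures quoted above ($0.02 and $0.05 log data CPM against a $10 media CPM); the $20 media CPM is an added, hypothetical number to show how the share shrinks as media CPMs rise.

```python
def log_cost_share(log_cpm, media_cpm):
    """Log data cost as a fraction of media cost, both in dollars per 1,000 impressions."""
    return log_cpm / media_cpm


scenarios = [
    ("rule of thumb, $10 media CPM", 0.02, 10.0),
    ("nickel cap, $10 media CPM", 0.05, 10.0),
    ("rule of thumb, $20 media CPM (hypothetical)", 0.02, 20.0),
]

for label, log_cpm, media_cpm in scenarios:
    print(f"{label}: {log_cost_share(log_cpm, media_cpm):.2%} of media")
```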

Judging usefulness

The litmus test for whether log data is useful comes down to two related questions: Does log data answer questions that would otherwise be unanswerable, and does the resulting information generate value by improving future buying and planning cycles?

The first advice I would give to marketers this holiday season is to separate log data use cases into two types: fundamental and advanced.

Advanced use cases are mostly about a concept called expected value bidding (EVB). EVB is when an advertiser, in conjunction with a DSP, uses historical log data to bid the true predicted value of an impression and then aims to minimize the error between predicted and actual outcomes. If EVB sounds non-trivial, that’s because it is not easy. However, it will most certainly separate winners (value creators) from losers (value destroyers) when competing for scarce high-quality inventory in competitive auction environments.
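To make the idea concrete, here is a toy sketch of one simple reading of EVB: bid the historical probability of an outcome times what that outcome is worth, then check the prediction against a later period. The segments, counts, dollar value and the choice of mean absolute error are all assumptions for illustration; a production model would be far richer than a per-segment average.

```python
# Hypothetical historical log summary: impressions and outcomes (e.g. conversions) per segment.
history = {
    "news": {"impressions": 10_000, "outcomes": 20},
    "games": {"impressions": 10_000, "outcomes": 5},
}

# Hypothetical later period, used to compare predictions with what actually happened.
holdout = {
    "news": {"impressions": 5_000, "outcomes": 8},
    "games": {"impressions": 5_000, "outcomes": 4},
}

VALUE_PER_OUTCOME = 5.00  # assumed dollar value of one outcome to the advertiser


def outcome_rates(log_summary):
    """Estimate the probability of an outcome per impression for each segment."""
    return {seg: row["outcomes"] / row["impressions"] for seg, row in log_summary.items()}


def expected_value_bid_cpm(rate, value_per_outcome):
    """Bid CPM = 1,000 x P(outcome) x value of the outcome."""
    return 1000 * rate * value_per_outcome


def mean_absolute_error(rates, actuals):
    """Average gap between predicted and actual outcome counts across segments."""
    gaps = [
        abs(rates[seg] * row["impressions"] - row["outcomes"])
        for seg, row in actuals.items()
    ]
    return sum(gaps) / len(gaps)


rates = outcome_rates(history)
for segment, rate in rates.items():
    print(f"{segment}: bid ${expected_value_bid_cpm(rate, VALUE_PER_OUTCOME):.2f} CPM")
print(f"mean absolute error on holdout: {mean_absolute_error(rates, holdout):.2f} outcomes")
```

The hard part the sketch skips is exactly what makes EVB difficult: getting the predicted rates right at the impression level and shrinking that error over successive buying cycles.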

Marketers should think about EVB the way JFK thought about going to the moon in 1962: “We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard, because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one which we intend to win.”

Marketers can start working on fundamental log data use cases by asking questions to which they want precise answers. For example, to learn how truthful exchanges are when they declare first- or second-price auctions, you can compare bid prices to clearing prices. The success or failure of your bid strategy depends on knowing the difference. You can also delve into viewability to gain a massive buying edge or make eye-opening value comparisons by bringing in log data from all those direct publisher buys to inform future buying decisions.
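The bid-versus-clearing-price check might look something like the sketch below. The exchange names, field names, prices and the 1% tolerance are all hypothetical. Treat the result as a prompt for questions, not proof: a genuine second-price auction can still clear near the bid when a floor or runner-up bid sits close to it.

```python
# Hypothetical won-impression rows: what was bid and what was actually charged (CPM, USD).
won_impressions = [
    {"exchange": "X", "bid_price": 5.00, "clearing_price": 5.00},
    {"exchange": "X", "bid_price": 4.00, "clearing_price": 4.00},
    {"exchange": "X", "bid_price": 6.00, "clearing_price": 5.98},
    {"exchange": "Y", "bid_price": 5.00, "clearing_price": 3.10},
    {"exchange": "Y", "bid_price": 4.00, "clearing_price": 2.50},
    {"exchange": "Y", "bid_price": 6.00, "clearing_price": 4.20},
]


def auction_behavior(rows, tolerance=0.01):
    """Per exchange: how often the clearing price is (nearly) the full bid."""
    stats = {}
    for row in rows:
        s = stats.setdefault(row["exchange"], {"wins": 0, "paid_bid": 0, "ratio_sum": 0.0})
        ratio = row["clearing_price"] / row["bid_price"]
        s["wins"] += 1
        s["ratio_sum"] += ratio
        if ratio >= 1 - tolerance:
            s["paid_bid"] += 1  # charged the bid, within tolerance
    return {
        exchange: {
            "share_paid_full_bid": s["paid_bid"] / s["wins"],
            "mean_clearing_to_bid": s["ratio_sum"] / s["wins"],
        }
        for exchange, s in stats.items()
    }


for exchange, summary in auction_behavior(won_impressions).items():
    print(exchange, summary)
```

In this made-up data, exchange X charges the full bid every time, which is how a first-price auction behaves, while exchange Y clears well below the bid. If X had declared itself second-price, that gap between declaration and behavior is the question to take back to the exchange.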

Perhaps the biggest value of launching these iterative Q&A sprints is the journey itself. You will likely find it generates nearly all the value. New paths in your log data forest will be revealed, and if done well, the paths less traveled will make all the difference in 2020 and beyond.

Evolve.

Follow Tom Triscari (@triscari) and AdExchanger (@adexchanger) on Twitter.
