“DataDriven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.
Today’s column is written by Jay Friedman, chief operating officer at Goodway Group.
I thought the days when companies promised awesome algorithms without offering any details were over. Companies like AppNexus or DataXu are transparent with their algorithms, which are used to optimize digital media. Yet I still hear others pitch “black box” algorithm cure-alls.
While it’d be great to put this debate to bed, it’s not that easy. While researching this piece, for example, I found that for every data scientist who tells you the right way to develop a performance algorithm, three others claim the equation is wrong and propose something different.
To shed light on the algorithm debate, I’ve laid out a few different perspectives below that are meant to be common sense. I’m not a data scientist, but you shouldn’t have to be one to understand what you’re being sold. This is for you, the media buyer, the client-side marketing executive trying to make sense of big data, and the media sales rep who wants the complex broken down into something simpler.
For starters, it’s important to understand that any algorithm must have enough data about a given combination of variables to decide its value. For instance, you wouldn’t take a poll of just one person and project the national presidential election because the sample size would be too small. In a race with at least two presidential candidates, you’d need a sample size of 1,067 according to this sample size calculator, if you assumed there are 180 million registered voters and wanted a margin of error of 3% and a 95% confidence level. This is sufficient because the top candidate may be favored by 51%, and the other 46%, with some unknowns. With the 3% margin of error, even if this poll were taken hundreds of times, the candidate with 51% would receive between 48% and 54% of votes 95% of the time. Margin of error works on a bell curve like this, which assumes a 95% confidence level:
OK, maybe math wasn’t your favorite subject, but don’t be intimidated. To read the above, just note that the curve “bunches together” more with a lower margin of error. This just means there’s a much better chance your figures are accurate.
But what if there were 500 viable candidates and none were heavily favored? The top candidate gets 0.8%, the lowest 0.02%. With so many bunched together, even the 2% curve above leaves us uncertain how truly separated the candidates are. Therefore, you might need to increase the sample size.
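For readers who want to check the polling figure, here is a minimal sketch of the kind of arithmetic a sample size calculator typically performs. It assumes the standard formula for estimating a proportion with a finite population correction and a worst-case 50/50 split; the calculator referenced above may differ in its details, and the function name is only illustrative.

```python
from statistics import NormalDist


def sample_size(population, margin_of_error, confidence=0.95, proportion=0.5):
    """Sample size for estimating a proportion (assumed textbook formula)."""
    # Two-sided z-score for the confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Size needed if the population were effectively infinite.
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction: shrinks n0 slightly for a real population.
    return n0 / (1 + (n0 - 1) / population)


# 180 million registered voters, 3% margin of error, 95% confidence.
poll = sample_size(population=180_000_000, margin_of_error=0.03)
print(round(poll))  # 1067
```

Note that the population barely matters at this margin of error: the result is driven almost entirely by the 3% figure.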
Here’s how this translates to a digital display or video-advertising program (mobile is a bit different).
Five Hundred Candidates or 3.6 Billion Value Combinations
In a typical RTB campaign, 50,000 impressions that ran on a random news website generated a 0.1% conversion rate versus a goal of 0.08% conversion rate. This equates to 50 total conversions. However, if you break it down further, you see 48 of the 50 conversions occurred between 7 a.m. and 10 a.m. Within those 48, 35 occurred on a Monday. Within those 35, 27 were on Windows 7 operating system (OS) machines. You can see how quickly this unfolds, and it could go on further by adding in more variables.
The key takeaway here is this random news website isn’t necessarily a great site as a lone variable. It’s good at certain times, on certain days, with certain other variables applied.
How many variables and outcomes do you need to take into account when examining a buy? The following are conservative values that actually hurt my case.
Variable                     Number of Values
Days of Week                 7
Hours of Day                 24
Browsers                     6
Devices                      3
OS                           4
Sites                        10,000
Ad Sizes                     3
Data Segments                10
Creative Treatments          1
Total Unique Combinations    3,628,800,000

That’s right: More than 3 billion unique combinations can be taken into account – and this is conservative. It’s probably more like 50,000 sites, 20 data segments, and so on, which would make the number much larger.
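The total is simply the product of the value counts for each variable. A short sketch that reproduces it, along with the larger scenario of 50,000 sites and 20 data segments (the dictionary names are illustrative only):

```python
from math import prod

# Number of possible values per variable, as listed in the column.
conservative = {
    "Days of Week": 7,
    "Hours of Day": 24,
    "Browsers": 6,
    "Devices": 3,
    "OS": 4,
    "Sites": 10_000,
    "Ad Sizes": 3,
    "Data Segments": 10,
    "Creative Treatments": 1,
}

# Every variable can take any of its values independently, so multiply.
total = prod(conservative.values())
print(f"{total:,}")  # 3,628,800,000

# The less conservative scenario: 50,000 sites and 20 data segments.
larger = {**conservative, "Sites": 50_000, "Data Segments": 20}
print(f"{prod(larger.values()):,}")  # 36,288,000,000
```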
Being an advertising major and not formally trained in statistics, I consulted two professional statisticians. These gentlemen advised on some of the techniques below to ensure I followed best practices within their industry.
There are two ways to look at this: We can “project forward” for sample size, as in the earlier presidential poll example, or “look backward,” since this is a scenario where we already have data, assuming the buy has already run. When projecting forward, due to so many unique combinations, it’s likely the performance of millions of combinations will bunch up within 0.001% of each other. Going back to that sample size calculator, with an Internet population of 214 million, to get down to a 0.001% margin of error, your sample size now needs to be more than 209 million. That’s a lot to “sample” before knowing what performs and what doesn’t. But this really doesn’t feel right. So let’s “look backward” instead.
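The 209 million figure can be checked with the same kind of arithmetic a sample size calculator typically uses. This is a sketch under assumptions: the textbook proportion formula, a worst-case 50/50 split, a 95% confidence level, and a finite population correction. The function name is hypothetical.

```python
from statistics import NormalDist

# Two-sided z-score for 95% confidence (about 1.96).
Z_95 = NormalDist().inv_cdf(0.975)


def required_sample(population, margin_of_error, z=Z_95, proportion=0.5):
    """Return (uncorrected, population-corrected) sample sizes."""
    # Sample needed if the population were unlimited.
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction caps the answer below the population size.
    corrected = n0 / (1 + (n0 - 1) / population)
    return n0, corrected


# 214 million Internet users, 0.001% margin of error (0.00001 as a fraction).
uncorrected, corrected = required_sample(214_000_000, 0.00001)
print(f"{uncorrected:,.0f} before the population correction")
print(f"{corrected:,.0f} after it")
```

Before the correction the formula asks for roughly 9.6 billion observations, far more than the population itself. The correction pulls that down to a little over 209 million, which is nearly everyone online.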
To look backward, let’s determine how many “observations” or impressions we need per unique combination of values to derive a statistically valid and confident decision. Per a whole lot of amazingly sleep-inducing Internet chat forums on the subject, there are some instances where 10 observations will suffice and other instances where 30 or 40 will be considered reasonable. Even if 10 observations or impressions are enough, are you running about 36.3 billion impressions per flight? This certainly won’t work, so maybe it’s time to give up on the notion of understanding every unique, detailed combination.
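The impression count follows directly from multiplying the number of unique combinations by the observations wanted for each. A quick sketch, which generously assumes impressions could be spread perfectly evenly across combinations:

```python
# Total unique combinations from the table earlier in the column.
TOTAL_COMBINATIONS = 3_628_800_000

# Observation thresholds mentioned above: 10, 30 or 40 per combination.
required = {obs: obs * TOTAL_COMBINATIONS for obs in (10, 30, 40)}

for obs, impressions in required.items():
    print(f"{obs} observations each -> {impressions:,} impressions")
```

Even the lowest threshold works out to 36,288,000,000 impressions, and real delivery is never evenly spread, so the true requirement would be higher still.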
Just Be Better Than a Human with Excel
Yes, the perfect algorithm should theoretically explore every combination within the variables. But the example above proves this too unlikely, and no algorithm is perfect. Conversely, we don’t need an algorithm that only looks at a single variable. A human could do that with the “sort” function in Excel. Going back to our random news website, let’s say the algorithm looks at just two variables at a time, such as site and data segment, browser and hour of day, or site and day of week. We could argue certain variables are more important than others, but we’re talking magic in a box here. Surely it can calculate any two variables at a time.
To do this, we need the total number of individual pairs among the values in these variables. Cutting the number of sites down to 1,000 to further prove the point, I’d love to tell you I know the formula to figure this out, but three minutes in Excel multiplying each column out gave me 59,284 unique pairings.
You’ll remember some stats folks suggesting 10 observations or impressions per combination would be enough. Would you optimize anything off of 10 impressions? Even 100? Since we’re trying to be more realistic but still conservative, we’ll use 1,000 impressions per combination of values. Now we’re up to 59,284,000 impressions needed to get good data across all two-value pairs. Use a more realistic threshold like 5,000 per combination, and we’re up to more than 295 million. How many of you are running this type of buy, with one vendor, in one flight?
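For anyone who wants to skip the Excel step, the pairing count can be reproduced in a few lines. The sketch below takes every pair of distinct variables, multiplies their value counts, and sums the results, with sites cut to 1,000 as described above. The variable names are illustrative.

```python
from itertools import combinations

# Value counts per variable, with sites reduced to 1,000.
value_counts = {
    "Days of Week": 7,
    "Hours of Day": 24,
    "Browsers": 6,
    "Devices": 3,
    "OS": 4,
    "Sites": 1_000,
    "Ad Sizes": 3,
    "Data Segments": 10,
    "Creative Treatments": 1,
}

# For each pair of different variables, count every value-to-value pairing.
pairings = sum(a * b for a, b in combinations(value_counts.values(), 2))
print(f"{pairings:,} unique pairings")  # 59,284 unique pairings

# Impressions needed at different thresholds per pairing.
for per_pairing in (1_000, 5_000):
    print(f"{per_pairing:,} each -> {pairings * per_pairing:,} impressions")
```

At 1,000 impressions per pairing this gives 59,284,000, and at 5,000 it gives 296,420,000, consistent with the figures above.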
Frequency
Rather than looking at all of the media variables mentioned above, it might be easier to pivot our viewpoint and look at users instead. This would suggest the algorithm is going to optimize against users and not the media variables like site, time of day and so on. To do this, we need to look at frequency. Going back to the notion of “observations,” research shows us 10 is actually an OK number. We’ve looked at thousands of campaigns and seen that a monthly frequency of eight to 12 is needed before we see results diminish in efficiency. OK, time out: It’s common-sense gut-check time.
If you need roughly 10 impressions per user before you know whether or not to optimize that user in or out of the buy, you’ve also served that user enough impressions to make him or her convert if it was going to happen. There’s no point in optimizing against that user now: You already know the outcome.
The 100 Millisecond Response
Moving away from statistics, let’s address the myth that “no human can make decisions within the 100 milliseconds we have to make an RTB decision.” That’s correct, but algorithms don’t, either.
The reality is most RTB-ecosystem participants cache their line items and, therefore, their bids, so they can respond within the 100 milliseconds and not be timed out. In order to cache these line items and bids, the algorithm has to work independently of the bidding, establish new line-item values in the system, and then allow the system to cache those. Even though they’re working independently, they theoretically could repopulate and recache thousands of times per second. Are they? Find out for yourself by asking your RTB rep. If proud of the answer, he or she will tell you.
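To make the caching idea concrete, here is a deliberately simplified sketch. Everything in it is hypothetical: the class, the function, the line-item ID and the bidding rule are invented for illustration and do not describe any vendor’s system. The point is only the separation between a slow optimizer that writes bids and a fast request path that reads them.

```python
import time


class BidCache:
    """Holds precomputed bids so the bidder never runs the model per request."""

    def __init__(self):
        self._bids = {}
        self.refreshed_at = None

    def refresh(self, new_bids):
        # Called by the optimizer on its own schedule, outside the request path.
        self._bids = dict(new_bids)
        self.refreshed_at = time.time()

    def bid_for(self, line_item_id):
        # Request path: a dictionary lookup, fast enough for a tight timeout.
        return self._bids.get(line_item_id)


def optimizer_pass(performance, target_cpa):
    """Hypothetical rule: CPM bid = conversion rate x target CPA x 1,000."""
    return {
        item: round(conversions / impressions * target_cpa * 1000, 2)
        for item, (conversions, impressions) in performance.items()
    }


# Made-up performance data: line item -> (conversions, impressions).
performance = {"li_news_site": (50, 50_000)}

cache = BidCache()
cache.refresh(optimizer_pass(performance, target_cpa=20.0))
print(cache.bid_for("li_news_site"))  # 20.0
print(cache.bid_for("li_unknown"))    # None (no cached bid, so no bid)
```

How often `refresh` runs is exactly the question to put to your RTB rep: the cached bid is only as current as the last optimizer pass.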
Bringing It All Together
At this point you might ask what the point of an algorithm is at all. The point of this piece isn’t to pick on any specific algorithm but to give color around the lunacy that says any digital media algorithm can work magic.
If you’re executing a $50,000 buy with a vendor, take some time and do the math before you decide to just leave it to people who say they have an algorithm. A good algorithm should be transparent, and the company’s work and limitations behind it should be as well. Companies should tell you when they can improve performance and when they can’t, when there simply isn’t enough data.
And they should be willing to give you the data if you would like to review it or make decisions yourself. If I looked like George Clooney and wanted to try to get a date, I wouldn’t go out on the prowl with a bag over my head. Those who are confident in their product will show it off and answer your questions without a guarded response.
Follow Jay Friedman (@jaymfriedman) and AdExchanger (@adexchanger) on Twitter.
Are you saying that the actual algorithms should be shared with all the clients and agencies? Wouldn’t that be giving away a recipe for success? Granted, some people take advantage of this “black box” and pitch a lesser product that ultimately will not perform. If your product is sound and your algorithm is finding users better than its competitors, showing back-end analytics should be enough. I’m wondering: What do you define as transparency within an algorithm?
A company promoting its algorithm as its secret sauce should explain to the client and agency exactly what the algorithm does that makes it so amazing. In simple terms. They don’t have to give away the exact equation, but they should show what it’s doing and what it’s optimizing to (beyond “conversions”) so the person spending the money knows what it’s getting.
The problem with algorithms in the advertising industry is that 99.99% of marketing people don’t know what an algorithm is other than it sounds interesting. Plus, the algorithms can’t be very sophisticated as they have to make extremely fast decisions with way too much data to work from, so mostly they have to be dumbed down to be able to deliver an ad fast enough for the available inventory, and the algorithms are not even customized to the advertiser but for the network. So, if you are working from multiple networks, using multiple algorithms, it is all just a big data cluster that offers very limited value, but it sounds cool to say you have data sciences and algorithms, blah, blah, blah. The advertising industry in general doesn’t even understand big data or what to do with it yet, especially the 20-year-old media buyers that studied advertising and history in college, or the 40-year-old marketing execs that learned how to use spreadsheets from Lotus 1-2-3. The industry has so far to come and this is just another one of those industry buzz words that means nothing to 99.99% of the industry.
You can never actually “calculate” a bid in 100 milliseconds. All you can do is to have a feedback loop which calculates every x hours and sends the results to all the bidders.
Black boxes work great, until they don’t. Nobody cares about your algorithms, they just want stuff to work. This is part of why some providers have chosen just to be ad networks and not tech providers – a lot of the market just isn’t ready to manage algorithms vs. campaigns. It’s changing, but not as fast as some would like, or have you believe.
Great analysis. Your sample size challenges are just like the ones Yieldex had to solve for premium publisher forecasting. One tidbit, for example: you mention 3.6b possible combinations, but in fact in a given day there can’t be more combinations than impressions. And in our research, for a typical 1 billion impressions, there are only about 100 million actual combinations. This helps narrow the problem – but it’s still a pretty tough one. Thanks for the article.
I’m with you on this one, Rob. No one cares about the algorithms – they are a means to an end. That end should be the KPI’s of our clients – business metrics, not digital metrics. Any explanation of an algorithm should relate back to the client. Unless we answer the question: “What’s in it for the customer?”, we are not really opening up the black box. Adtech blather about maximizing bid price yield spreads or the number of calculations per ms might make us feel smarter, but it’s not really demystifying the black box.
We need to articulate how our solution works – in plain English – and why this will help solve a specific business problem for our customer. Are you getting me more customers? More profitable customers? Lowering my costs? Then we need to prove it through empirical testing and measurement. Today I think far too much time/energy is invested in developing and marketing super-cool algorithms and too little time invested in exactly how we are going to validate and measure the outputs – in terms of the business KPI’s that our clients care about: customers, profits, cost et cetera.
Rob and David
To start, I’m in agreement with both of you on your overall point. The issue, though, is that media buyers absolutely get swept up by algorithm talk. Companies that have built themselves on black box algorithms – and not just recent companies – have secured a tremendous amount of revenue through this exact strategy. “Buy us because the machine is brilliant and no one else has the math we do.”
If I’m a CMO and my agency is spending significant money with folks like this, ESPECIALLY if my agency is still determining conversions on a last-ad-seen or last-click basis, I care a great deal. Algos that win the last seen/click game aren’t benefiting anyone but themselves.
In those rare instances where a marketer has a full attribution program in place, I agree KPIs are all that count. But buyers still need to make a decision on who to test. Hopefully when they read this and comments like yours buyers will dig deeper and push harder to be presented with real, understandable value before they spend their money.
It’s not the algorithm, it’s the data and its quality. Couldn’t agree more on the last sentence, which applies to all products.
Great article.
You bring up the very valid issue of impression size, but you seemed to brush over the idea of the number of conversions. What did the statisticians have to say about the number of conversions necessary for the algorithm to make statistically significant determinations? If you ask them, they will tell you that only a modest number are needed to look at some of the “base” metrics, such as Exchange, or dayparting. Even having the algorithm weigh these two metrics when bidding is worth it. Imagine trying to manage a campaign with a strategy for every one of the permutations of those two elements?
You bring up the news website analogy, which I believe is somewhat misleading. You make the point that there may be some dayparts during which that site will not convert. However, that does not discount the fact that we should outperform the average by simply focusing on that website. Surely there will be waste if we do win impressions during times which do not perform. However, on the whole, statistically, we should still outperform the alternative. We are playing a game of dice looking for statistical arbitrage and not deterministic arbitrage.
You are simply throwing out the baby with the bathwater by dismissing algorithms for not being able to make a statistically significant determination on every permutation of an impression possible.
In the end, it still beats some planner that pushes money to some publisher because that publisher took them out to dinner. You forgot that often the most faulty, biased algorithms are the ones that sit within our minds.
Jay –
An important variable to add to your math equation, and this one wreaks havoc on sample conversations and any absolutes about algorithms: the creative. All the math you have done is only relevant if the creative stays exactly the same. One change in a creative element and the equation building needs to start over. Getting an appropriate sample size to make 100% carbon based decisions is just a tough nut to crack.
Good article. Agree that transparency is good in any business partnership. One alternative might be to never work with one and only one DSP when doing programmatic buying. Might working with at least two DSPs provide a better Execute – Test – Learn environment? In manufacturing, companies like Honda would never work with a single supplier of parts, as they have learned that multiple suppliers help them better achieve their business objectives. Might multiple DSP partnerships better serve an organization in terms of achieving its overall marketing objectives?