“Data Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.
Today’s column is by Douglas de Jager, founder of Spider.io.
In Will Luttrell’s AdExchanger article of last week, he argues that if the display advertising buy side adopts better performance metrics then the problem of fraud can be solved across display advertising.
Luttrell is conflating two distinct problems across display advertising: (i) unscrupulous exploitation of broken performance metrics and (ii) fraudulent gaming of performance metrics.
This article will argue that no new set of performance metrics will prevent fraud. It will discuss how Google’s Ad Traffic Quality Team (formerly the Click Quality Team) has over time come to police Google’s search PPC ad exchange, namely AdWords. It will argue that display ad exchanges and other suppliers of display ad inventory (ad networks, supply-side platforms, etc.) should police the inventory they sell in much the same way. It will illustrate the surprising extent to which display ad exchanges and other suppliers of display ad inventory currently fail to police their inventory. It will also suggest an explanation of why this is so.
Display advertising fraud is pervasive
Fraud is endemic across today’s display advertising ecosystem. Luttrell estimates 20 to 30% of display ad inventory is fraudulent. For an empirical counterpoint, the Chameleon botnet spanned all the major US ad exchanges at the time of disclosure and was driving enough fake traffic—on its own—to account for 30% of the display ad inventory sold through one of these major exchanges. The Chameleon botnet is not unique. For example, we will shortly be disclosing details of a second major botnet, with a distinct signature, targeting a distinct cluster of websites. Botnet traffic is also not just limited to fraudulent publishers. We have come to understand that the Chameleon botnet, for example, has supplied traffic both to the website of one of the most highly regarded US newspapers and also to the websites of one of the world’s largest online media companies.
If the online display advertising industry is to win over advertising spend from other advertising channels, like television, radio and search PPC, then the structural failures that currently allow fraud to be so pervasive across display advertising need to be tackled.
No new metrics will prevent display advertising fraud
Not only will no new performance metrics solve the problem of display advertising fraud; there is in fact no static solution to the problem of fraud across display advertising.
Luttrell is correct in suggesting that today’s performance metrics are broken. At spider.io, we have reported on the buying patterns of advertisers across select remnant eBay display ad inventory where the average viewability rate of this remnant ad inventory is 14%. spider.io has also reported on the buying patterns of advertisers across select Facebook apps where the average viewability rates are only marginally above 0% -- made still worse because the ad impressions are auto-rotated every 30 to 45 seconds. Unfortunately this sort of behavior is commonplace, though seldom discussed, with buyers often looking to intercept view-through CPA credit by cookie-bombing highly trafficked websites that are regularly revisited by users.
This unscrupulous exploitation of the view-through CPA metric is a significant problem for the display advertising industry, and I agree with Luttrell that savvy advertisers should push for performance metrics that better reflect the real value of display ad inventory.
However, no new performance metric will prevent fraudulent gaming. Ad viewability, as measured by the position of the ad impression relative to the browser’s viewport, will not help if the browser is being powered by a bot. Not even a move toward measuring some improved attribution metric would prevent display advertising fraud -- just as measuring attributable actions would not prevent AdWords fraud.
If buyers are going to continue buying display ad inventory on a CPM basis, then any metric for attributable actions will always allow room for fraudulent inventory to be bought. This is because not every valid ad impression results in an action -- meaning that it would not be possible to make any fine-grained distinction between a fraudulent ad impression and an unconverted but valid impression. Recall for example that the Chameleon botnet was supplying traffic both to the website of one of the most highly regarded US newspapers and also to the websites of one of the world’s largest online media companies. It would not have been possible to identify the Chameleon botnet’s contribution to traffic across these sites by appealing simply to causally attributable conversions.
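To make the point concrete, here is a minimal arithmetic sketch in Python. The function name, the conversion rates and the bot fractions are all hypothetical, and the sketch assumes for simplicity that bot impressions never convert. Under those assumptions, a clean site and a site with heavy bot traffic can report exactly the same conversions per impression, so the aggregate metric alone cannot reveal the bot contribution.

```python
def observed_conversion_rate(valid_conversion_rate: float, bot_fraction: float) -> float:
    """Conversions per impression across all traffic.

    Simplifying assumption (not from the article): bot impressions never
    convert, so only the human share of impressions contributes conversions.
    """
    return valid_conversion_rate * (1.0 - bot_fraction)


# Hypothetical figures, for illustration only.
# Site A: no bot traffic, human visitors convert at 0.05%.
site_a = observed_conversion_rate(valid_conversion_rate=0.0005, bot_fraction=0.0)

# Site B: half of all impressions are bots, human visitors convert at 0.10%.
site_b = observed_conversion_rate(valid_conversion_rate=0.0010, bot_fraction=0.5)

# Both sites show the same rate to a buyer who only sees attributed conversions.
print(site_a, site_b)
```

The buyer observes one number per site, while two unknowns (the true conversion rate of valid impressions and the fraudulent share) sit behind it.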
If buyers were instead to start buying display ad inventory principally on some form of CPA basis, then this would also be open to fraudulent gaming. This is because no CPA metric has a secure underlying method for attributing actions to display ad impressions.
This subject warrants an independent post, but here are two of the more obvious points. Firstly, most conversion tracking is performed client-side, and spoofing conversion pixels has already been shown to be easy. Many travel search engines are vulnerable to this sort of spoofing, as their conversions comprise exit clicks rather than purchases (they sell clicks to airlines and hotels). However, even websites with purchase actions are susceptible. Secondly, the Bamital botnet has already provided a real-world, large-scale example of how a botnet can intercept CPA credit. Suppose a user has an infected PC. Suppose that this user searches on Google.com for a particular product. The Bamital botnet showed how easy it would be on the infected PC to replace any links on the Google search page -- organic links or search PPC links -- with an affiliate or CPA-monetizable link for this particular product. This would allow fraudsters to intercept credit for any visit to the product page from Google. There are many variants of this type of man-in-the-middle attack to intercept CPA credit.
Looking to search PPC for guidance
Efforts to prevent fraud across the search PPC exchanges like AdWords are markedly more mature than they are across display advertising. Google’s Ad Traffic Quality Team, in particular, now employs well-documented best practice to identify and prevent fraud across search PPC. This best practice was established during a particular class action lawsuit against Google, Lane’s Gifts & Collectibles, settled by Google in 2006 with a $90 million settlement fund. What constitutes best practice—in terms of what is reasonable and appropriate—is set out in a report by NYU professor Alexander Tuzhilin.
Information asymmetry is the reason the courts held Google as being responsible for preventing click fraud. Google has access to a great deal of information on the clicker before any click happens. Furthermore, if the clicker does not stay for a long time on the destination page and instead bounces, then the recipient of the click may never be in a position to determine whether the click was fraudulent. This information asymmetry is even more pronounced across display advertising, not least because display ad impressions seldom result in an analyzable click trace (with typical click-through rates on display ad impressions reported to be 0.06%).
In Professor Tuzhilin’s report, he discusses what constitutes reasonable and appropriate measures when it comes to preventing click fraud. At the same time, Professor Tuzhilin makes clear that it is not just acceptable to have the underlying anti-fraud machinery be hidden in a black box. Indeed, it is required. This is because advertising fraud is not a static solvable problem. The battle against advertising fraud is in fact an arms race. Information on fraud prevention will invariably leak, perhaps only indirectly through the leaking of blacklists, but this means that each new measure to prevent fraud will yield new fraudulent efforts, which in turn must yield new preventative measures, and so on.
If fraud is to be tackled across display advertising, then the dynamic nature of the problem needs not just to be accepted. It needs to be embraced. Ultimately the efforts of fraudsters will only be prevented if the measures to identify and prevent fraud keep changing quickly enough that it becomes financially unviable for fraudsters to keep trying to game the system.
The way to prevent display advertising fraud is the same way that search PPC fraud is tackled by Google.
The sell side’s approach to even the most basic checks
Despite the sheer size of the online display advertising industry (it has been reported that by 2016 spend across display advertising is expected to overtake spend across search PPC advertising in the US), there are few efforts to prevent fraud. There is certainly no established best practice for the prevention of fraud.
There are three main classes of automated traffic: (1) where the User-Agent header is not that of a browser; (2) where the IP address is that of a cloud service provider; and (3) where a bot is masquerading as a typical website visitor with a browser User-Agent header and a residential or office IP address.
The third class of automated traffic is difficult to identify and prevent. The Chameleon botnet is an example of the third class of automated traffic.
The first two classes of automated traffic, however, can and should be filtered out easily by suppliers of ad inventory. Indeed, subscription to the IAB/ABCe International Spiders & Bots List is intended to enable these two classes of automated traffic to be filtered out easily. Yet despite the ease with which this can be implemented, advertisers will perhaps be surprised to learn that all of the main ad exchanges allow advertisers to bid for ad inventory when the User-Agent header is not that of a browser or the IP address is that of a cloud service provider.
For example, there are several sites like hawaiidermatology.com across which over 10% of the ad inventory sold through ad exchanges is being sold when the User-Agent header is not that of a browser. There are sites like cruisewhat.com across which over 90% of the ad inventory sold through ad exchanges is being sold when the IP address is that of a cloud service provider.
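The two basic checks described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any exchange's actual filter: the User-Agent tokens and the network ranges below are hypothetical placeholders (the ranges are reserved documentation addresses), and a real supplier would load a maintained list such as the IAB/ABCe list instead. The function name `classify_request` is also an assumption.

```python
import ipaddress

# Hypothetical placeholder tokens; a production filter would use a maintained list.
NON_BROWSER_UA_TOKENS = ("bot", "spider", "crawler", "curl", "python-requests", "web preview")

# Hypothetical placeholder ranges (reserved documentation networks), standing in
# for the published address ranges of cloud service providers.
CLOUD_NETWORKS = [ipaddress.ip_network(cidr) for cidr in ("192.0.2.0/24", "198.51.100.0/24")]


def classify_request(user_agent: str, ip: str) -> str:
    """Label a request by the first basic check it fails, if any."""
    ua = user_agent.lower()
    # Class 1: the User-Agent header is missing or is not that of a browser.
    if not ua or any(token in ua for token in NON_BROWSER_UA_TOKENS):
        return "non_browser_user_agent"
    # Class 2: the IP address belongs to a cloud service provider.
    address = ipaddress.ip_address(ip)
    if any(address in network for network in CLOUD_NETWORKS):
        return "cloud_ip"
    # Everything else: a genuine visitor, or a class-3 bot that these
    # static checks cannot distinguish from one.
    return "unclassified"


browser_ua = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0 Safari/537.36"
print(classify_request("Googlebot/2.1", "203.0.113.5"))
print(classify_request(browser_ua, "192.0.2.10"))
print(classify_request(browser_ua, "203.0.113.5"))
```

Note that the final branch deliberately makes no claim of validity. Static checks of this kind only remove the first two classes of automated traffic; the third class passes straight through.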
While all the main display ad exchanges pass bid requests and allow ad impressions to be served when either the User-Agent header is not that of a browser or the IP address is that of a cloud service provider, it is important to add a surprising qualification for one of the main display ad exchanges, namely Google AdX. Industry perception is that Google AdX is the one display ad exchange that is actively policed for fraud; however, Google’s processes are somewhat unexpected. Whilst Google passes bid requests to advertisers for ad inventory associated with the first two classes of automated traffic, Google does not charge advertisers for the subsequent impressions. According to Google’s documentation,
"All filtration is performed 'after-the-fact' and passively. That is, the user (browser, robot etc.) is provided with their request without indication their traffic has been flagged."
Google’s approach to this problem makes some sense when one considers that this is how Google tackles click fraud across search PPC -- as click filtration typically has to happen after the fact. This said, many on the display advertising buy side are not aware of Google’s approach to filtering out the obviously automated traffic. This means that whilst Google does not charge buyers for these obviously automated impressions, these unfiltered impressions pervert the optimization engines of the buyer, and they also pervert the spend-control strategies of buyers. As a surprising illustration of this happening in practice, below is a display ad impression for Kiwi Bank being served through Google AdX to one of Google’s own bots, namely Google Web Preview, where Google.com is also the publisher. Google Web Preview is in fact one of the most common non-browser User-Agent headers across Google AdX.
Why is there no best practice to prevent display ad fraud?
In Professor Tuzhilin’s report, he reveals the surprising fact that Google was illegitimately charging for immediate second clicks on text-link ads until March, 2005. This is surprising for several reasons. Charging advertisers for duplicate clicks is clearly not acceptable. It was plainly Google’s responsibility to prevent charges for duplicate clicks. It was straightforward for Google to stop charging advertisers for duplicate clicks. Post-click user sessions were relatively auditable by advertisers. Google was a public company at the time. And Google was also one of the most admired companies in terms of perceived ethical standards.
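To show how straightforward such a check can be, here is a minimal Python sketch of duplicate-click filtering. It is not Google's method: the `Click` record, the function name `billable_clicks` and the two-second window are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    visitor_id: str   # hypothetical identifier for the clicker
    ad_id: str        # hypothetical identifier for the text-link ad
    timestamp: float  # seconds since some fixed reference point


def billable_clicks(clicks, window_seconds=2.0):
    """Drop any click that follows a click by the same visitor on the same ad
    within window_seconds (an assumed threshold, not a documented one)."""
    last_seen = {}
    billable = []
    for click in sorted(clicks, key=lambda c: c.timestamp):
        key = (click.visitor_id, click.ad_id)
        previous = last_seen.get(key)
        if previous is None or click.timestamp - previous > window_seconds:
            billable.append(click)
        # Track every click, billed or not, so a rapid chain is suppressed in full.
        last_seen[key] = click.timestamp
    return billable


clicks = [
    Click("visitor-1", "ad-1", 0.0),
    Click("visitor-1", "ad-1", 0.5),   # immediate second click, not billed
    Click("visitor-2", "ad-1", 0.2),   # different visitor, billed
    Click("visitor-1", "ad-1", 10.0),  # well outside the window, billed
]
print(len(billable_clicks(clicks)))
```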
The situation is many times worse across the display advertising industry.
Across the display advertising ecosystem it is not clear whose responsibility it is to identify and prevent fraud. This LUMAscape diagram shows how fragmented the display advertising ecosystem is. In the case of search PPC, Google acts as both the publisher and the seller of the ads. In the display advertising ecosystem there are too many disparate parties between the advertiser and the publisher (sell-side platform, a daisy chain of ad networks, ad exchange, demand-side platform, ad trading desk, etc.) for it to be clear who is responsible for preventing fraud.
It is also not easy to identify and prevent fraud. This is particularly so the further away the fraud checks are from the publisher. The path connecting advertisers to publishers is built on the assumption that “everything just works.” In practice, it seldom does, even without any efforts by nefarious parties to game the system. The more moving parts connecting advertisers to publishers, the fuzzier the audit trail becomes.
This article has examined how display advertising fraud has gone unchecked because of the fragmented state of the display advertising ecosystem. There are typically multiple parties between the advertiser and the publisher, and this has made it less straightforward to assign responsibility for preventing fraud across display advertising than it has been across search PPC. The audit trail on the display advertising buy side is also not clear enough for advertisers to identify inventory anomalies easily.
This article has also touched on the extent to which fraud blights the display advertising ecosystem. Early indications are that fraud is a substantially larger problem across display advertising than it might ever have been across search PPC. According to Google’s announced Fourth Quarter 2012 Financial Results, 67% of Google’s revenues were generated across sites owned by Google, and only 27% of Google’s revenues were generated across third-party websites. Display advertising is quite different. There is no major publisher that serves as the principal source of all ad inventory. Instead display ad inventory is distributed across many smaller publishers, and the financial incentive for these publishers to game the system is great.
In much the same way that search-PPC ad exchanges have been deemed responsible by the courts for policing search PPC advertising fraud, we have proposed in this article that display ad exchanges and other suppliers of display ad inventory should be held responsible for policing display advertising fraud.
If the online display advertising industry is to compete with other advertising channels, then the leaders of our biggest display ad exchanges, like Brian O’Kelley of AppNexus, are right to make fighting fraud a top priority.