MICHAEL TIFFANY: Because ad fraud is, in some ways, a better crime. No matter how stealthy your tech stack is, or how awesome your rootkit is, once you steal the money, someone tends to notice that it’s missing. When you succeed at ad fraud, by its very nature, no one notices. If it succeeds, it completely flies under the radar. You just collect money. You don’t need money mules for sending anything via Western Union. The people who are developing these tools are the world’s best, and make an astonishing amount of money. They’re primarily going head to head with the best big data analytics that an ad company can buy. That’s just a huge mismatch.
The crime is perpetrated with botnets, so they’re composed of real people’s computers. And those real people browse the Internet and buy stuff. The rootkits have read-write access to the hard drives of these real consumers, so they know that if they can relay the cookies and metadata associated with actual people, then they can get their bots targeted. Not only that, they get their bots whitelisted as known good consumers, people who just bought something yesterday. That never gets priced out because there will always be a price for targeting people who are known to be good consumers. If you’re adding to the total number of apparent impressions to those known good consumers, this doesn’t raise any alarms, because we all know that the volume of impressions relative to final actual sales is so big. You can inflate those quite a way and make millions of dollars while the targeting systems still flag this stuff as high-value impressions.
How does your detection technology fight against this?
Our service gets integrated into a Web session. In the ad context, that means we’re in the ad. So when the ad is displayed, whether it’s a mobile ad or a display ad or a video ad, a little bit of our service is also downloaded along with that impression.
We basically worked out a way of doing remote malware detection, which finds bots with no trouble. And it means that our results are super fine-grained. We can tell the difference between a legit impression and a fake impression served to the same computer.
We’ve been able to shake the market up fairly quickly because we’re not presenting people with sort of Bayesian suspiciousness. It's hard evidence. It’s the kind of thing designed to put bank robbers in jail, so it’s substantially harder to argue with. And it also means that we’re finding not just the pure fraud websites that hide out in the long tail, but since we are on an impression-by-impression basis we can also see when name-brand legit publishers are sourcing bot traffic in one way or another.
How does that affect the overall ad industry fraud problem?
This is one of the major reasons why the problem is so severe. It turns out that major publishers are often themselves buyers of advertising, or they have affiliate deals, or they work with an audience platform, or search engine marketing turns out to be riddled with bots. All of these measures end up bringing bot traffic in the door, because the way that those deals are done monetizes volume. Basically any time cash is changing hands based on volume, there’s an incentive for one of those parties to goose the numbers with bots.
So what we found is that it can be hard to follow the money all the way to the bot operator, because there can be middlemen in between that act as traffic washers. But ultimately the name-brand advertisers are being defrauded by serving impressions to these bots when they end up on premium inventory.
Can you walk us through a case study?
We’ve worked both on the buy side and the sell side. Advertisers sometimes just bake our detection into their ads and then we share the results across the board. We’ve been getting cuts on the sell side, like at the ad network and exchange level, where these guys just want to purge the bots from all of their inventory. They like working with us because we are not coming to them and saying, “We do better big data analytics than you.”
How do you work with ad networks?
If you’re a big ad network, you have some super brains sifting through the data. We just have this entirely new set of data. So they’re incorporating it to do two things. First, you kick off the pure fraud players, and we’re always finding surprises, because there are some sophisticated operations. They hide in the noise by operating what is truly a network of what we call cash-out sites. These are fake websites, none of which take checks that are too big. You might have a hundred such sites. They all look disconnected. They all make maybe 10 grand a month. Which means you’re really making a million dollars a month across all of the sites. But at the 10 grand level you’re not raising any eyebrows, it’s not going to come above the noise floor, so no one notices. So step one is to locate those, and what you have to do is just kick them off.
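The arithmetic behind the cash-out network can be sketched in a few lines of Python. The site count and per-site revenue come from the figures quoted above; the per-site review threshold is a purely hypothetical number chosen for illustration, not something stated in the interview.

```python
# Figures quoted in the interview: about 100 sites at about $10,000 a month each.
SITE_COUNT = 100
PER_SITE_MONTHLY_USD = 10_000

# Hypothetical per-site payout level at which a network might look closer.
# This value is an assumption for illustration only.
REVIEW_THRESHOLD_USD = 50_000


def total_monthly_revenue(site_revenues):
    """Sum the monthly revenue across every site in the network."""
    return sum(site_revenues)


def sites_above_threshold(site_revenues, threshold):
    """Return the revenues of sites that exceed a per-site threshold."""
    return [revenue for revenue in site_revenues if revenue > threshold]


site_revenues = [PER_SITE_MONTHLY_USD] * SITE_COUNT
network_total = total_monthly_revenue(site_revenues)
flagged = sites_above_threshold(site_revenues, REVIEW_THRESHOLD_USD)

print(f"Network total per month: ${network_total:,}")
print(f"Sites flagged by per-site review: {len(flagged)}")
```

Under these assumptions no single site crosses the per-site threshold, even though the network as a whole collects a million dollars a month, which is why the sites have to be linked together before they stand out.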
Where does fraud detection get more complicated?
The harder case is the mixed traffic where some website has a human audience but then the bot traffic is essentially goosing the numbers at this point. And I thought at first that these guys just needed an awkward talking to, but sometimes they’re unaware of how they’re sourcing the bots, or they can’t do anything about it. The fact is, you can’t prevent a bot from visiting your website. In fact, sometimes the bots come to your website not even because you’re in on it, but because they want to pick up a cookie from your site so they can go get retargeted somewhere else. So we’ve been working at the network, the exchange and the ad-server level to do real-time distinguishing between the human visitors and bot visitors, to then only serve billable impressions to one and not the other.
But this turns out to be tricky because we don’t want to signal back to the bad guys in real time that they’ve been caught. So you can’t just not serve an ad, or serve a fake ad, because they’ll notice and realize, OK, time to change tactics. So we’ve been doing a lot of actually kind of hidden work on the back end to just not bill for that bot traffic.
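The idea of serving every visitor the same ad while billing only for human traffic might look roughly like the Python sketch below. The `Impression` type, the `is_bot` flag, and both function names are hypothetical; the interview does not describe how the detection signal or the billing back end is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    impression_id: str
    is_bot: bool       # assumed to come from an upstream detection signal
    price_cents: int   # price of the impression in whole cents


def serve_ad(impression: Impression) -> str:
    """Return the same creative for every visitor.

    The response does not depend on is_bot, so the serving path
    gives a bot operator no real-time hint that it was detected.
    """
    return "ad-creative"


def billable_total_cents(impressions: list[Impression]) -> int:
    """Sum the price of impressions that were not flagged as bots."""
    return sum(i.price_cents for i in impressions if not i.is_bot)


impressions = [
    Impression("a", is_bot=False, price_cents=5),
    Impression("b", is_bot=True, price_cents=5),
    Impression("c", is_bot=False, price_cents=3),
]

served = [serve_ad(i) for i in impressions]
invoice_cents = billable_total_cents(impressions)
print(f"Served {len(served)} ads, billed {invoice_cents} cents")
```

Keeping the filtering in the billing step rather than the serving step is the design choice the answer describes: the bot sees a normal ad, and the advertiser simply is not charged for it.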
How aware is the marketing world when it comes to the scope of this problem?
They’re unaware of the scope and they’re unaware of how it’s affecting them. Here’s what we’re seeing again and again. People think, “Oh yeah, we all know about fraud, there are these bogus websites. But I'm only buying premium inventory through my premium ad agency so of course it can’t affect me, because my campaigns are some of the more sophisticated targeted ones.”
So the big surprises here are that the bots are on premium inventory and that they’re actively gaming the targeting systems. That’s what really raises eyebrows. And that’s what we’ve spent the past few months following our public launch kind of raising awareness of. It’s not just on the fringes, this is a parasite that’s beginning to eat its host.
Why did you go after this problem in particular?
It’s a deep flaw. And it’s especially distressing that so much money is going to the actual people who are breaking into grandma’s computer. So this is the big kind of moral calling that is driving us to this, and we wear that on our sleeves. It’s not attractive to everyone. Some people consider it really naïve, but those that are receptive to that find it deeply compelling.
It turns out that we’re on this messianic mission to change the world. That’s led to our aggressive early adopters, who tend to be people who had already been looking at the numbers and thinking, “This kinda doesn’t make sense.” So far, the way we’ve built this business is not by trying to convert the whole world to our side, but to seek out those people who already thought, “Something’s fishy here.” And then, when they see our numbers and they hear our findings, everything falls into place. They’re like, “Aha, that’s it.”