Mobile Data Has A Quality Control Problem

This is the second in a series of deep dives from AdExchanger on mobile fraud and mobile data quality, including guides to fraud tactics and threat vectors and practical solutions from advertisers in the growth and user acquisition trenches. Read the first story (“2018 Will Be A Year Of Reckoning For Mobile App-Install Fraud”) and the third (“Anatomy Of Mobile Ad Fraud: Web Vs. App”).

Mobile data for targeting in the programmatic ecosystem is everywhere. But high-quality mobile data that’s what it purports to be? Not so much.

“The problem is knowing what we’ve bought,” said Sarah Melrose, programmatic director at Sydney-based WPP media agency Ikon Communications.

For instance, publishers often append location data to bid requests to try to drive up the price of their inventory, but agencies like Ikon don’t know if they’re getting precise GPS data or a general location inferred from an IP address.

“We plan around specific postcodes or points of interest and what we actually end up getting is less accurate,” Melrose said. “In an ideal world, I’d like to see less inventory available, but higher accuracy in the data.”

More than half of all impressions in the US include location data, according to mobile data provider Mobilewalla, but around 80% of that location data is junk, said Mobilewalla CEO Anindya Datta. Recent research from location tech company Thinknear found that only 30% of all “hyperlocal” location data in the programmatic ecosystem was accurate to within 100 meters of a user’s actual real-time location, or roughly the length of a football field.

Crap attack

The exchanges are also awash with imprecise demographic data. The majority of ad requests in mobile exchanges don’t even include age data, and of those that do, 60% are off the mark, according to a study by Sprint subsidiary Pinsight.

Ikon does what it can to test the data it uses to power its mobile performance campaigns, but it can still be a crapshoot.

“We use Nielsen for demo verification and a surprising amount of data is still highly inaccurate,” Melrose said.

In some cases, the blunders are borderline amusing. Mobile data marketplace Tapfwd once received data from a reputable data provider for a segment of “non-smartphone owners” tied to Apple ad IDs, said CEO Alex Wasserman.

But it’s no surprise that mobile data is unreliable – there’s no standard way to collect it.

“Take age data,” said Scott Swanson, CEO of mobile ad platform Aki. “One app might say, ‘Enter your age,’ and another will say, ‘Enter your birth date,’ not to mention the fact that some people fake their age. The data’s out of whack.”

In other cases, apps that aren’t on the level purposely obfuscate the media source, so an advertiser thinks it’s buying a certain type of audience when it’s actually getting who knows what.

Crap tactics

Anomaly detection is the primary method for fighting fraud and separating out poor-quality mobile data.

“You observe the patterns of real customers to find behavior that’s outside that box,” Mobilewalla’s Datta said.

In the location data example, brief snapshots of what people are doing and where they are don’t tell the whole story. If a device appears to be in San Francisco at a coffee shop, that makes sense … but not if that same device shows up in Philly at a mall an hour later.

Then again, location signals aren’t usually static. Over the course of a day or several days, a device will move around during the morning and evening commutes, to lunch in the middle of the day, to dinner or a movie in the evening.

When location tech company GroundTruth (formerly xAd) sees a location signal pegged to an ad impression, for example, it creates a time stamp and tracks subsequent activity to make sure it looks human.

“We need to look at the quality of the location we receive in the moment, but also historical data to develop a pattern of how people are moving around,” said Bhishma Savdharia, director of business development at GroundTruth.
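The San Francisco-to-Philadelphia example lends itself to a simple "impossible travel" check: compare consecutive time-stamped pings from one device and flag any pair that implies an unrealistic speed. What follows is a minimal sketch, not GroundTruth's or Mobilewalla's actual method. The `Ping` record, the function names and the 1,000 km/h ceiling (roughly airliner cruising speed) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 1000.0  # assumption: roughly airliner cruising speed


@dataclass
class Ping:
    """Hypothetical record: one location signal attached to an ad impression."""
    device_id: str
    lat: float
    lon: float
    ts: float  # Unix time in seconds


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometers."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def implausible_jumps(pings, max_kmh=MAX_PLAUSIBLE_KMH):
    """Return (earlier, later) ping pairs whose implied speed exceeds max_kmh.

    Assumes every ping in the list belongs to the same device.
    """
    ordered = sorted(pings, key=lambda p: p.ts)
    flagged = []
    for prev, cur in zip(ordered, ordered[1:]):
        # Treat gaps shorter than one second as one second to avoid dividing by zero.
        hours = max(cur.ts - prev.ts, 1.0) / 3600.0
        km = haversine_km(prev.lat, prev.lon, cur.lat, cur.lon)
        if km / hours > max_kmh:
            flagged.append((prev, cur))
    return flagged


# A coffee shop in San Francisco, then a mall in Philadelphia one hour later.
sf = Ping("device-1", 37.7749, -122.4194, 0.0)
philly = Ping("device-1", 39.9526, -75.1652, 3600.0)
suspicious = implausible_jumps([sf, philly])
```

A check like this only catches the crudest errors. As the surrounding reporting makes clear, vendors also weigh a device's historical movement pattern, which a single pairwise comparison does not capture.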

The problem is being able to suss out all the red flags at scale. Mobilewalla, for example, which enriches mobile usage data with inferred attributes like age, gender and income to create segments for targeting, operates in 30 countries and observes around 25 billion bid requests a day.

That translates to roughly 100 terabytes of data daily, which really starts to add up over time.
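A quick back-of-the-envelope calculation shows what those two figures imply. This sketch assumes decimal terabytes (10^12 bytes) and treats both numbers as round daily averages. The two-year total is a raw, uncompressed figure, and the variable names are illustrative.

```python
BID_REQUESTS_PER_DAY = 25_000_000_000   # "around 25 billion" per the article
BYTES_PER_DAY = 100 * 10**12            # "roughly 100 terabytes", decimal TB assumed
SECONDS_PER_DAY = 86_400
DAYS_IN_TWO_YEARS = 730

# Average payload per bid request.
bytes_per_request = BYTES_PER_DAY / BID_REQUESTS_PER_DAY

# Average arrival rate, assuming traffic were spread evenly over the day.
requests_per_second = BID_REQUESTS_PER_DAY / SECONDS_PER_DAY

# Raw volume accumulated over two years, in petabytes, before any compression.
two_year_petabytes = BYTES_PER_DAY * DAYS_IN_TWO_YEARS / 10**15

print(f"{bytes_per_request:,.0f} bytes per request")
print(f"{requests_per_second:,.0f} requests per second")
print(f"{two_year_petabytes:,.0f} PB over two years")
```

Under those assumptions, each request averages about 4 KB, requests arrive at close to 290,000 per second, and two years of raw data would come to roughly 73 petabytes.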

To handle that volume, Mobilewalla developed a compression algorithm to store and analyze more than two years of programmatic bid requests, and it uses a machine learning algorithm to identify fraud and scrub mobile audience data before it gets parceled into segments.

“Signals and ad requests are constantly coming out of a device, and we need to look at the rate at which legitimate requests arrive as people navigate,” Datta said.
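The rate-based idea Datta describes can be illustrated with a sliding-window counter that flags any device sending more requests in a short window than a person plausibly could. This is a deliberately simple fixed-threshold sketch, not the machine learning approach Mobilewalla uses. The function name, the 60-second window and the limit of 30 requests are assumptions chosen only for illustration.

```python
from collections import defaultdict, deque


def devices_over_rate(events, window_s=60.0, max_requests=30):
    """Return the set of device IDs that exceed max_requests within any window_s span.

    events: iterable of (device_id, timestamp_in_seconds) tuples, in any order.
    """
    recent = defaultdict(deque)  # device_id -> timestamps inside the current window
    flagged = set()
    for device_id, ts in sorted(events, key=lambda e: e[1]):
        window = recent[device_id]
        window.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while ts - window[0] > window_s:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(device_id)
    return flagged


# One device sends a request every 20 seconds; another fires 100 in 10 seconds.
human_like = [("device-a", i * 20.0) for i in range(5)]
bot_like = [("device-b", i * 0.1) for i in range(100)]
over_limit = devices_over_rate(human_like + bot_like)
```

In practice, a threshold would be learned from observed behavior rather than hard-coded, since legitimate request rates vary by app and ad format.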

Dentsu Aegis-owned programmatic agency Amnet has been buying mobile audience data from Mobilewalla for its programmatic campaigns in APAC with positive results, said Conrad Tallariti, Amnet’s GM for Southeast Asia.

“Without clear insights into the audience you’re engaging, you run the risk of targeting the wrong audience, or reaching the right consumers at an inopportune time – and there is no greater sin than wasting your audience’s time,” Tallariti said. “As the industry matures, we’re finding that for data, the value of quality over quantity cannot be overstated.”
