Dstillery Chief Data Scientist Talks Data Science And Paid Advertising

Dstillery’s investment in data science dates to 2008, when the company – then known as Media6degrees – first applied big data analysis with a social skew. Since then it has broadened its focus to a wider range of digital media, but has retained its emphasis on the data.

The person in charge of those efforts today is Claudia Perlich. As chief scientist, she oversees the data and algorithms that drive decisioning, targeting and fraud detection for the programmatic ad network.

She started four years ago at Dstillery, where top clients include American Express, AT&T, Time Warner Cable and Neiman Marcus. Previously she worked for IBM Research focusing on predictive modeling and marketing.

Perlich spoke to AdExchanger about data science as it relates to paid advertising, programmatic methodologies and cross-device targeting.

AdExchanger: In what category do you bucket yourself? Any interest in launching a “self serve” business?

CLAUDIA PERLICH: The “bucket” question has always been a challenge for us. Terry Kawaja puts Dstillery in the “Targeted Ad Network” bucket on his LUMAscape, which is probably the most appropriate. But our services encompass the core of a DSP and the core of a DMP. We are a company focused on identifying new customers for our marketing partners, and we provide a full suite of services allowing those partners to execute on that intelligence across any channel.

Dstillery operates a robust self-service mobile advertising business that is used by several leading advertising networks. We acquired this platform a year ago when we bought Everyscreen Media. This coming fall we plan to expand the self-serve capabilities beyond mobile and add significant multiplatform functionality. Our goal is to allow clients to access all of our capabilities on either a managed-services or self-serve basis.

How many total advertisers are running on the platform? Any advertisers buying direct, or is it all agency relationships? 

Dstillery works primarily with agencies and their trading desks. In some cases we work directly with the advertiser, primarily when they are handling all of their media buying in-house. Dstillery works with hundreds of marketers.

Who does Dstillery compete with? 

On the competition side, in terms of targeting, there are a number of players I recognize as having a similar approach to utilizing data. I’m not sure how strong Quantcast is in the market, but I really respect them from a data perspective. Some people would say we’re similar to Rocket Fuel. The problem with the question of direct competitors is that other than some of the big players (Yahoo, Google), very few of the targeting companies are willing to share how they work. We actually publish our approaches.

Also, companies similar to Dstillery are typically much larger in scope. We have very good connections to Yahoo, and the same holds for LinkedIn, Facebook and Google. I know from my contacts what’s running under the hood at those companies and I think that it’s very close in spirit to what we’re trying to do. The primary difference is that our core business is very different from theirs.

How does Dstillery apply data versus its competitors?

We’ve built a privacy-friendly way of identifying who is really interested in a product. Instead of trying to bucket people into demographic or behavioral groups, we can operate using a consumer’s very specific sequence of actions. We see what consumers are reading online, or sites they visit. Next, we’ll build large-scale learning models that predict the probability of how browsing histories relay consumers’ interest. This approach is individualized to every brand or product, and it’s also individualized to every consumer on the other end. There’s no grouping involved in the process.

Are there common methodologies around using data in programmatic? Are there common mistakes?

The question is really, what data are you bringing to bear? Different players in programmatic have different perspectives. The first way is to only use first-party data. You buy data segments, and can therefore bid based on information provided to you. The second way is to do contextual advertising, and that is not specific to programmatic. But first-party data only gives marketers a tiny window into what the consumer is doing right now. And third-party data gives you somewhat unreliable additional information. The other information people bring to bear is CRM data, which they integrate into decisioning. The last method is adding, in addition to these segments I already mentioned, additional first-party data, such as browser history in our case.

One thing we have observed consistently is that third-party information that lends aggregate information about demographics or behavior tends to be highly unreliable. This is the fight between scale and quality. We have seen issues around data quality. If I have first-party CRM information and browser history information, I already know so much about the consumer that this aggregate information typically doesn’t add any value to what I already know. More mistakes I see in the space involve metrics, how you measure performance of campaigns and how you look at attribution.

What major measurement errors are you seeing?

People underestimate how well we can make certain metrics look good. That’s what these algorithms do. They’re excellent at optimizing toward a specific metric. We’ve seen this on click-through rates repeatedly, and this is dominant in mobile right now. The example is this: If you want to have a campaign with high click-through rate on mobile, just throw ads on the flashlight app. It’s a whole bunch of people fumbling in the dark to turn the thing on or off. It doesn’t mean there’s any interest in the product. It just means that it’s an environment where the usage of the device leads to higher click-through rates.

On the desktop side we’ve been looking at attribution as a problem. As long as you’re still looking at last touch attribution, you’re basically bound to favor approaches that buy cheap ad impressions across the board in order to be the last to get that touch point. In terms of the marginal value, most of them are completely invisible. So, you’re driving your campaign toward something that doesn’t add value but that makes a particular statistic look good.
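The last-touch bias described above can be illustrated with a minimal Python sketch. It is not Dstillery's method: the channel names, the impression paths and the `last_touch_credit` helper are all hypothetical, chosen only to show how the rule hands full credit to whichever channel served the final impression.

```python
from collections import Counter

def last_touch_credit(paths):
    """Give each conversion's full credit to the channel of the final impression."""
    credit = Counter()
    for path in paths:
        if path:  # ignore converters with no recorded impressions
            credit[path[-1]] += 1
    return credit

# Hypothetical impression paths for three converting users, in time order.
paths = [
    ["prospecting", "cheap_retargeting"],
    ["prospecting", "prospecting", "cheap_retargeting"],
    ["prospecting"],
]

credit = last_touch_credit(paths)
print(credit)
```

In this toy data the prospecting channel reached every converter first, yet the channel that simply bought the last impression collects most of the credit.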

How is the industry using data science for cross-device targeting?

There are a number of companies who have the ability to associate different devices simply because they have some form of a login. Very few of these companies have any incentive to share this information with the market. The smaller companies who have login-based capabilities typically have very small coverage so they can only associate a small percentage of the population.

Everything else in the market, and where I position our work, does probabilistic matching. We look at the usage pattern from one device to the next, in order to identify specific overlap that hints that it’s actually the same person. The primary example here is a home network, where people use both their mobile and desktop devices. There are very few people who use multiple devices on that network.
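As a rough illustration of probabilistic matching on usage overlap, the Python sketch below scores device pairs by how many networks they have in common. Everything in it is an assumption made for illustration: the Jaccard score, the 0.5 threshold, the device and network labels, and the function names. The interview does not say which similarity measure or cutoff Dstillery uses.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap of two sets: |a & b| / |a | b|, or 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def candidate_pairs(device_networks, threshold=0.5):
    """Return (device, device, score) for pairs whose overlap meets the threshold."""
    pairs = []
    for d1, d2 in combinations(sorted(device_networks), 2):
        score = jaccard(device_networks[d1], device_networks[d2])
        if score >= threshold:
            pairs.append((d1, d2, score))
    return pairs

# Hypothetical observations: the networks each device has been seen on.
device_networks = {
    "desktop_A": {"home_1", "office_1"},
    "phone_A": {"home_1", "office_1", "cafe_9"},
    "phone_B": {"home_2", "cafe_9"},
}

pairs = candidate_pairs(device_networks)
print(pairs)
```

A shared public network alone (the cafe here) produces only a weak score, while a shared home and office pushes a pair over the threshold. A match of this kind is a likelihood, not a confirmed identity.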

What are the main challenges and opportunities big data presents for cross-device targeting?

Cross-device advertising hinges on the ability to associate individuals across multiple devices. It’s very simple to run a campaign and show 20% of the ads on mobile and 80% on desktop, but that’s not true cross-device targeting. It completely ignores the connection between both devices. The challenge for cross-device targeting is in the identification or assignment. This is a technical problem, and most players in the space have similar technology.

What I’m trying to communicate to brands that want to use that technology is that there’s a trade-off here. It’s typically not about reaching the exact right person. Let’s say I cross-device target you on your desktop, and instead of identifying you on your phone, I identify the phone of your partner, husband or wife. That’s actually not bad. The rate should be about finding people in your environment. Often, they’re just as good of a target.

Ultimately I don’t think targeting should be about tracking specific individuals with 100% certainty. I don’t think we want that from a privacy perspective or even from a marketing perspective. But there’s a lot of value in getting just close enough to a target’s environment.

What problems do NewSQL databases solve?

They’re clearly part of the new big data environment, and they serve very different needs. They’re not specifically limited to multigenre analytics, but are used for many different environments including, on our side, cross-device association.

NewSQL databases work like a very large look-up table. You can give me all the information you’ve stored for one identifier and you don’t have to have a very stringent data format.

One of the problems with SQL is that you have to prestructure all the information. You have to have one field that is “age” or one field that is “income.” In the modern NewSQL, you don’t have to prespecify all the different data points you want to use and what each one means.

Now, what does NewSQL buy you? It’s mostly about speed and flexibility. It’s speed in the sense that SQL-based databases are slower because they have a lot of data overhead. They cannot respond at the speed we currently need in the programmatic environment. We have about 100 milliseconds to do all the decisioning for programmatic advertising. In that time, we want to look up all your history and what device you’re associated with. That’s where the NewSQL comes in.
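The look-up-table access pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory dictionary stands in for the real data store, and the `put` and `get` helpers, the identifiers and the field names are all hypothetical.

```python
# In-memory stand-in for a key-value store: one record per identifier.
store = {}

def put(identifier, **fields):
    """Merge arbitrary fields into the record for one identifier (no fixed schema)."""
    store.setdefault(identifier, {}).update(fields)

def get(identifier):
    """Return everything stored for the identifier, or an empty record."""
    return store.get(identifier, {})

# Records do not need to share the same fields.
put("cookie_123", recent_sites=["news.example", "shoes.example"])
put("cookie_123", linked_devices=["phone_A"])
put("cookie_456", segment_scores={"travel": 0.7})

record = get("cookie_123")
print(record)
```

A single keyed read returns the browsing history and the device association together, with no joins and no predeclared columns, which is the property that matters when the whole bidding decision has to fit in roughly 100 milliseconds.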

What are the next sources for big data collection? Social media? Wearables?

On the social channels we already have a large stream of data, and it’s more a matter of what is acceptable and what isn’t. How much information is Facebook willing to share, for example. We will continually have new social channels and some of them will share data, but we can’t possibly rely on that. In the small scale it could give us some substantive data that could be very useful.

I think wearables is a very interesting direction. Increasingly, I see people developing really cool devices that include partial analytics. Whether they are willing or able to share that data as a way of funding their product initially, I don’t know. But it would give us more specific views on what consumers do. Wearables might eventually be a good source of habit and location data. I don’t think we’ve fully understood the commercial potential on the marketing side, but I think wearables can add a tremendous amount of volume and diversity. 

What are the key points around fraud and paid advertising?

My sense is that we’re in fraud fighting for the long term. It’s the same as email spam or the like. It’s not going to go away, it’s going to change its flavor. Everything we build keeps changing, but fraud will persist. There are two things I would advise. Firstly, marketers need truly trusted partners. Transparency is key.

Secondly, if any results of marketing campaigns look too good to be true, they probably are. I would like to see some healthy skepticism when it comes to exceedingly good performance, because almost always there are problems lurking. Be realistic when it comes to what can be measured and what can be achieved.

What else?

I’m curious to see how well the industry will integrate all the different pieces of the puzzle into digital advertising and so on. I think the opportunities for video and for TV are tremendous. I think we’re a few steps away from full integration to get all the components worked into a comprehensive consumer view that is no longer siloed. That’s what I see, a continuous push for a complete integration of all these different data streams and opportunities to reach consumers.
