User-generated photos and videos are a largely untapped resource for advertisers. The challenge is leveraging the inventory.
A few startups have offered up solutions: GumGum helped L’Oréal place ads on publisher sites based on the hair color of women in photos and Stipple lets consumers make purchases directly from an image.
British ad tech firm WeSEE is another company that has jumped into the fray with its image and video classification solutions. Its customers (WeSEE wouldn’t disclose the exact number) include Virgin Holidays, Avis, HP, Cathay Pacific and Intel. It also works with agency trading desks, like Omnicom’s Accuen.
Accuen UK, for instance, has incorporated WeSEE’s technology since January.
“We see (haha) them as a great complement to other contextual/semantic data providers,” said Edward Thomas, Accuen’s head of product. “Their image classification tech gives a different interpretation of page content, and thrives in image-heavy environments common to social media platforms.”
AdExchanger spoke with WeSEE’s co-founder and CMO, Adrian Moxley.
AdExchanger: What problem are you trying to solve?
ADRIAN MOXLEY: We launched in 2010 and our goal was to classify the digital Web. We saw a trend where people were becoming editors of sites and sharing digital content, such as photographs and videos, and we saw an opportunity to put verified and brand-safe content into a recognizable category so that it can be targeted by advertisers.
How do you turn images into text?
We have our own proprietary technology that’s based around CNN (convolutional neural networks), as well as visual semantics. In other words, we create algorithms from visual content that can automatically self-learn what an image is and categorize it. Our magic sauce is understanding an image in a particular environment – is it user-generated or a professional image – and how the user interacts with it and what it means for the advertiser.
Do you use any tags or pixels?
We don’t use tags. If we’re processing a Web page that uses images, we don’t need any metadata. Our technology is also cookie-free. We don’t follow users around the Web. What differentiates us from other contextual data providers is we don’t require data – we generate data. We turn images into text keywords. If there is text on the page we’ll process it too, to provide full insight into the page.
What about video? How do you process video content?
We’ll be launching a video product soon but there are already a number of ways we can process video. We can do it through machine learning, viewer networks, visual semantics and data crawlers on a website.
Just to confirm, WeSEE’s technology can already be applied to video content but you’ll be launching a video-based product in a few months, is that correct?
We are able to analyze and classify video files now; however, we will be launching a crawler capable of extracting video content automatically from a web page.
Can you give me an example of an image or video that an advertiser was able to target against?
Well, right now for example, we have a partnership with AppNexus where we crawl through Web pages and our technology records what it sees on the page. That could be key words or the image that appears on the Web page. We found that 30% of the Web page classifications that are currently in AppNexus change when you take into consideration the image that is on the page. Visual content can actually change the “flavor” of the Web page.
What do you mean by changing the flavor or nature of the page?
Take a Web page dedicated to automobiles for instance. On that page, we know it’s about cars, but maybe there’s an image of a Land Rover parked on a beach. That page is obviously attractive to auto advertisers but it also might be attractive to car rental companies because of the beach or travel or holiday component of the page.
Or if you uploaded a photo of your new Land Rover on the beach in your Facebook photo album, our system can recognize that image and put it into a number of advertising categories like summer, travel or beach or automobiles. We’re creating a new advertising sector from that page.
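To make the Land Rover example concrete, here is a minimal sketch of how labels detected in an image could widen the categories a page is assigned. It is an illustration only, not WeSEE's method: the function `page_categories`, the `LABEL_TO_CATEGORIES` mapping and the labels themselves are hypothetical, and the image labels are assumed to come from some upstream classifier.

```python
def page_categories(text_categories, image_labels, label_to_categories):
    """Extend text-derived categories with categories implied by image labels."""
    categories = list(text_categories)
    for label in image_labels:
        # Labels missing from the mapping contribute nothing.
        for category in label_to_categories.get(label, []):
            if category not in categories:
                categories.append(category)
    return categories


# Hypothetical mapping from image labels to advertising categories.
LABEL_TO_CATEGORIES = {
    "land rover": ["automobiles"],
    "beach": ["travel", "summer"],
}

text_only = ["automobiles"]  # what the page text alone suggests
with_image = page_categories(text_only, ["land rover", "beach"], LABEL_TO_CATEGORIES)
print(with_image)  # ['automobiles', 'travel', 'summer']
```

In this sketch the page stays in the automobiles category but also becomes eligible for travel and summer campaigns, which is the change in "flavor" Moxley describes.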
Do you work with Instagram and Pinterest?
We’re not working with them at the moment, but those are the kinds of sites that would benefit from our technology since we can categorize their images to make sure they’re brand-safe and classify the content. Our goal is to make visual content, which is the fastest-growing type of online content, as valuable and tradeable as textual content.
Do you have your own demand-side platform (DSP)?
We don’t have our own DSP. We’re an agnostic technology. We don’t have any plans to set up our own DSP and we can work with anyone in the ad ecosystem.
How does your technology work on mobile apps?
We can work with apps as content is uploaded. An example of that is when you upload that photo of your new Land Rover onto a social media app. We can process that image at that point or we can process the content on the Web version of the product and that would also be targeted with advertising on the app.
What is your business model?
We charge on a CPM basis. The advertiser pays for the targetable data that we’re providing them. Depending on the volume, the cost ranges from about $0.10 to $0.15 CPM.
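For readers unfamiliar with CPM pricing, the arithmetic can be sketched as follows. CPM is a price per thousand impressions, so the quoted range translates into a total cost that scales with volume. The function name `data_cost` and the one-million-impression campaign are assumptions chosen for illustration; only the $0.10 and $0.15 rates come from the interview.

```python
def data_cost(impressions: int, cpm_usd: float) -> float:
    """Cost in USD for data billed per thousand impressions (CPM)."""
    return impressions / 1000 * cpm_usd


# Hypothetical campaign size; the rates are the ones quoted above.
impressions = 1_000_000
low = data_cost(impressions, 0.10)
high = data_cost(impressions, 0.15)
print(f"${low:.2f} to ${high:.2f}")  # $100.00 to $150.00
```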
Do you have any plans to offer a viewability solution?
We do have that on our road map. If we can understand an image or a video, there’s no reason why we shouldn’t also generate data about viewability.
What else is on your road map?
Our main focus is the US and we’re looking to launch offices on the West Coast and East Coast and we’ll be running more US campaigns. And like I mentioned earlier, we’re launching our video product in the next couple of months and that’ll be a big area for us. Our future is in video. Right now there’s a short supply of premium video and we see an opportunity to make user-generated video valuable to advertisers.
Corrections: The original article stated CNN stood for “computational neural networks.” The C actually stands for “convolutional.” The headline also described WeSEE as an “in-image ad targeting provider.” This is not accurate and has been amended to “visual ad tech company.” Finally, the image identified Moxley as CTO. He is in fact CMO. All errors have been corrected.