
How Eyeo Worked With Students To Innovate On AI


Eyeo is putting the “learning” in machine learning.

Last year, it partnered with a university student initiative in Munich to have students find new approaches to using AI in its online ad filtering.

As the parent company of AdBlock and Adblock Plus, two of the most downloaded ad-blocking services on the market, eyeo has invested a lot of time and money into developing its own deep-learning methods for detecting and filtering ads.

But eyeo had yet to explore how generative modeling could be used to curate and maintain ad filtering lists, which is typically a tedious, error-prone, largely manual process that is difficult to scale.

Eyeo’s extensions rely heavily on filter lists, such as EasyList, that are maintained by a community of mostly unpaid volunteers and contain tens of thousands of rules for blocking and hiding certain network requests and the HTML code that’s responsible for ad rendering.
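To make the two kinds of rules concrete, here is a minimal Python sketch that sorts rules written in Adblock Plus filter syntax into broad categories. The sample rules are invented for illustration rather than taken from EasyList, and the check is a deliberate simplification that ignores several rule variants.

```python
def rule_kind(rule: str) -> str:
    """Roughly classify one filter-list line (simplified; not a full parser)."""
    rule = rule.strip()
    if not rule or rule.startswith("!"):
        return "comment"            # lines starting with "!" are comments
    if "##" in rule:
        return "element hiding"     # hides page elements matched by a CSS selector
    if rule.startswith("@@"):
        return "exception"          # allowlists requests that would otherwise be blocked
    return "request blocking"       # blocks matching network requests


# Hypothetical rules, for illustration only.
sample_rules = [
    "! an illustrative comment line",
    "||ads.example.com^",
    "example.com##.ad-banner",
    "@@||example.com/player.js",
]
kinds = [rule_kind(r) for r in sample_rules]
```

Each rule type breaks pages in its own way when it is wrong, which is part of why maintaining tens of thousands of them by hand is error-prone.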

The ability to automatically update these lists with accuracy would be a major benefit, said Dr. Humera Noor Minhas, director of engineering at eyeo.

Automation station

Last year, TUM.ai, a student initiative within the Technical University of Munich, put out a call for proposals for its Moonshot competition, a hackathon-style project for students interested in pursuing a career in AI.

In response, eyeo submitted a proposal to have TUM.ai students tackle the challenge of developing an AI model that automatically classifies URLs and generates filter rules based on website content.

Eyeo’s proposal stood out due to “its real-world impact,” according to TUM.ai student advisor Thomas Wölkhart.

It also gave students the opportunity to acquire hands-on technical expertise and get direct feedback from eyeo engineers, he said.

See it to believe it

The six-week challenge ran from March to May 2023 and demonstrated how using generative AI could make ad filtering less onerous.

For instance, one student used OpenAI’s APIs to tweak the GPT-3 Ada model, producing an effective model that automatically generated filter rules based on webpage content.

The student’s successful model opened eyeo’s eyes to generative AI’s real-world applications.

What students lack in experience, they often make up for in curiosity and fresh perspective.

But the challenge’s biggest boon for eyeo wasn’t actually the model itself, according to Minhas. It was the data set eyeo created for students to work with during the challenge, which contained more than 1 million ads.

To generate the data set for the challenge, eyeo first looked at open-source lists, like Alexa and Similarweb, to find the most-visited domains in different regions, according to Minhas. Then it gathered information from the HTML of those sites, including the headings, organic content and, crucially for eyeo’s purposes, the ads on the page.

Challenge participants worked with two data sets to develop their AI models: a training set and a test set. The training set included 100,000 pairs of webpages and corresponding filter rules, while the test set held 36,000 webpages.

By dividing data into two sets, students could check how well their models generalized to data they hadn’t seen before, Wölkhart said, and the test set allowed eyeo to judge how closely the student-produced models matched the filters from the training set.
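One simple way to score a model on held-out pages is to count how often its generated rule exactly matches a reference rule. The Python sketch below assumes that metric purely for illustration; the article does not say how eyeo actually scored submissions, and the function name and sample rules are hypothetical.

```python
def exact_match_rate(predicted: list[str], reference: list[str]) -> float:
    """Share of held-out pages where the generated rule equals the reference rule."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    # Compare rule strings pairwise, ignoring surrounding whitespace.
    matches = sum(p.strip() == r.strip() for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical model output and reference rules for two test pages.
generated = ["example.com##.ad-banner", "||ads.example.net^"]
expected = ["example.com##.ad-banner", "||ads.example.org^"]
score = exact_match_rate(generated, expected)
```

Exact matching is strict: two differently written rules can hide the same ad, so a real evaluation would likely also check what each rule does on the page.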

But the process also simulated how companies test model performance before rolling out a model to users – an important step, Wölkhart said, since “one wrong filter rule could break a whole website for thousands of users.”

Eyes ahead

Following the Moonshot project, eyeo went on to use the data set it generated for the challenge to train another model involving URL parameters.

Eyeo tested using URL parameters to detect if a certain portion of a page has ads or not, Minhas said.

URL parameters are the key-value pairs in a query string appended to a basic web address. They are used to pass information to a server, track ad campaigns and customize user experiences on a website.
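As a rough illustration of what these parameters look like to a program, the Python sketch below pulls them out of a URL with the standard library. The URL and parameter names are made up, and this shows only the parsing step, not eyeo's classifier.

```python
from urllib.parse import parse_qs, urlsplit


def url_parameters(url: str) -> dict[str, list[str]]:
    """Return a URL's query-string parameters; each name maps to a list of values."""
    return parse_qs(urlsplit(url).query)


# Hypothetical URL; the domain and parameter names are invented.
example = "https://example.com/article?utm_campaign=spring&slot=top&slot=side"
params = url_parameters(example)
```

A classifier along the lines described here could treat the parameter names and values as input features, though the article does not detail which features eyeo used.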

This new URL-based classifier model achieved precision – a machine learning performance metric that measures the share of a model’s positive predictions that are actually correct – comparable to the solutions eyeo already has in production.
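For an ad classifier, precision is the number of items correctly flagged as ads divided by the total number of items flagged. A minimal Python sketch, using invented labels rather than eyeo's data:

```python
def precision(predicted: list[bool], actual: list[bool]) -> float:
    """Precision = true positives / (true positives + false positives)."""
    true_pos = sum(p and a for p, a in zip(predicted, actual))
    false_pos = sum(p and not a for p, a in zip(predicted, actual))
    flagged = true_pos + false_pos
    # With nothing flagged, precision is undefined; return 0.0 by convention here.
    return true_pos / flagged if flagged else 0.0


# Hypothetical labels: True means "this item is an ad".
model_says_ad = [True, True, True, False]
really_an_ad = [True, False, True, True]
result = precision(model_says_ad, really_an_ad)  # 2 of the 3 flagged items are ads
```

High precision matters for ad filtering because every false positive is legitimate page content that gets hidden from the user.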

The company is working on a proof of concept and has identified use cases for the new model, such as automatically detecting buggy ad filters. “It has generalization ability that allows us to detect ads in unknown domains and provide a better user experience in ad filtering,” Minhas said.

Next up, eyeo has been researching how to detect ads served in AI chatbot experiences and filter them out if a user doesn’t want to see them anymore.

“The data set was a huge step for us to take our research forward,” Minhas said.
