Using Machine Learning To Break The Cycle Of Bias

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Taejin In, vice president of product management at Dstillery.

Within the field of machine learning, there are two broad approaches, supervised and unsupervised learning, that advertisers must familiarize themselves with. Knowing how the two differ, and where each has the advantage, could make a huge difference to a brand’s advertising campaigns and bottom line.

The stakes go beyond ROI: Misunderstanding these approaches could lead to unintended cultural and social impact. In today’s heated political landscape, it is absolutely critical that brands and advertisers learn about the potential biases built into technology and do everything they can to avoid perpetuating cultural and racial biases.

Supervised vs. unsupervised learning

Part of the reason bias gets introduced into advertising campaigns is that brands and agencies often start their campaigns with the end outcome in mind. For decades, the world of surveys, focus groups and market research revealed to brands, “This is your audience.” Brands then took that research to their agency and tasked media buyers with finding the audience for targeting.

In supervised learning, the outcome is already established and models are then generated to help drive toward that outcome. For example, let’s say a brand wants to reach a surfer audience. Signals such as past surfboard purchases or surfing website browsing are defined as the outcome, a model is trained on them, and devices that exhibit those behaviors are then identified to receive the ad messages.
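To make the surfer example concrete, here is a minimal sketch of supervised learning: a tiny logistic regression trained on a handful of labeled devices. Everything in it is an assumption made for illustration, including the feature names, the toy data and the choice of logistic regression; it is not a description of how any particular ad platform builds its models.

```python
import math

# Hypothetical toy data: one row per device.
# Columns: [surf-site visits, surfboard purchases, news-site visits]
FEATURES = [
    [5, 1, 0],
    [4, 0, 1],
    [6, 1, 2],
    [0, 0, 5],
    [1, 0, 4],
    [0, 0, 6],
]
# The "supervision": a person decided in advance which devices count as surfers.
LABELS = [1, 1, 1, 0, 0, 0]


def sigmoid(z):
    # Clamp z so math.exp never overflows on extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Return the model's estimated probability that device x is a surfer."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(features, labels, epochs=500, lr=0.1):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = score(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


WEIGHTS, BIAS = train(FEATURES, LABELS)

# A device with heavy surf-site activity scores high; a news-only device scores low.
print(round(score(WEIGHTS, BIAS, [5, 1, 0]), 3))
print(round(score(WEIGHTS, BIAS, [0, 0, 5]), 3))
```

The point to notice is that the model can only learn the audience its labels describe. Whoever decides which signals count as “surfer” decides who the model will go on to find.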

With unsupervised machine learning, the outcome is left undefined and a learning algorithm is deployed to find patterns and structures in raw data. In some experiments I’ve seen, clustering – a common form of unsupervised learning – was used to determine website co-visitation patterns. Upon completion, very large clusters of audiences were identified that were not intuitive to the experimenter at first. These included political groupings that span the full left-to-right spectrum and websites frequently visited by underrepresented ethnic groups.
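A co-visitation clustering of this kind can be sketched in a few lines. The version below is a deliberately simple stand-in, not the method used in the experiments described above: it links any two sites whose visitor overlap (Jaccard similarity) meets a threshold and treats each connected group as a cluster. The site names, device IDs and the 0.4 threshold are all hypothetical.

```python
from itertools import combinations

# Hypothetical visit log: site -> set of device IDs seen on that site.
VISITS = {
    "surfnews.example": {"d1", "d2", "d3"},
    "boardshop.example": {"d1", "d2", "d4"},
    "cityforum.example": {"d5", "d6", "d7"},
    "localvoices.example": {"d5", "d6", "d8"},
    "recipes.example": {"d9"},
}


def jaccard(a, b):
    """Share of devices the two sites have in common, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sites(visits, threshold=0.4):
    """Group sites that are linked, directly or indirectly, by visitor overlap."""
    parent = {site: site for site in visits}

    def find(site):
        # Follow parent links to the cluster's representative site.
        while parent[site] != site:
            parent[site] = parent[parent[site]]
            site = parent[site]
        return site

    # No labels are supplied: only the overlap between sites drives the grouping.
    for a, b in combinations(visits, 2):
        if jaccard(visits[a], visits[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for site in visits:
        clusters.setdefault(find(site), []).append(site)
    return sorted(sorted(members) for members in clusters.values())


CLUSTERS = cluster_sites(VISITS)
for members in CLUSTERS:
    print(members)
```

Nobody told the algorithm that a “surfer” or “community forum” audience exists. The groups emerge from the visit data alone, which is how a cluster the experimenter did not anticipate can surface.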

Combating bias

There are two primary issues at play here. The first issue is related to who does the supervising in supervised machine learning. The advertising industry is overwhelmingly white, according to the ANA, with African Americans holding only 4% of senior-level jobs, and Hispanics holding 9% of those jobs. There is an outdated perception that ad tech founders are white, young or young-ish dudes with a technical degree from select universities, with the teams developing AI and machine learning also being overwhelmingly white and male. That means that the technology, taxonomy and products may be built in the image of their makers. For every surfer or auto-intender audience, there is likely a subset of the brand’s customers that is being neglected or ignored due to the inherent bias, intentional or not, in those creating the models.

The second issue is that the reliance on supervised learning means advertisers are missing out on relevant audiences that might only be identified by unsupervised learning. If brands aren’t identifying and speaking to their entire customer base because of bias in models, there’s a good chance they’re missing out on a large consumer group.

Ad tech isn’t alone in this issue: Studies show that many uses of AI can have racial bias. Even fundamental technologies like photography have evolved over time so that their technical elements favor white skin tones. In advertising, the advancements in unsupervised machine learning for targeting could help accelerate a much-needed correction, moving away from this bias. In fact, advertising could be one of the first real business ecosystems to eliminate these biases, not just because it’s the right thing to do, but because it’s financially beneficial for the brands.

As advertisers become more familiar with the applications of AI and machine learning, they need to understand the nuances within these emerging fields and probe to ensure that the solutions they deploy are free from potential biases. A failure to do so only perpetuates the status quo, underserving and underrepresenting what could be valuable new audiences for advertisers. As an industry, we need to continue to push for more varied voices so we can cast a more even net and drive better business results in the process.

Follow Dstillery (@Dstillery) and AdExchanger (@adexchanger) on Twitter.