A Big Data Truth: It’s All Relative

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Susan Zhang, data scientist at PlaceIQ.

In the early 1900s, David Hilbert set out to prove the consistency of mathematics by expressing all of it in a formal language built on a set of axioms, from which every mathematical statement could be deduced.

Hilbert believed that the derived statements would be consistent with one another. There would be no method of derivation by which we could obtain, from the same set of axioms, “1 + 1 = 2” in one case and “1 + 1 ≠ 2” in another.

One hundred years later, we sit on more data points about human behavior than ever. Data-driven is the go-to phrase for making decisions using statistical inference and complex computations. In digital marketing, utilizing these data points can help drive consumer outreach, illustrate trends in consumer behavior and shed light on patterns that would have otherwise gone unnoticed.

The ways in which we choose to use this data can vary tremendously. How then can we choose the best model?

A Search For Truth

To determine which method yields better results, we need a metric against which error can be measured and minimized. Unfortunately, the “truth sets” or “true values” that such a metric depends on are not necessarily available or obvious.

Take, for example, the task of describing everyday human behaviors.

Do people who shop at one grocery store also frequent the nearby fast food chain? Do people with a higher income behave differently than the unemployed? In each case, the point of the investigation is to determine the “truth set” – what people are actually doing, how they should be classified and what this classification implies about the state of the world.

Sure, we can create our own target sets with predefined socioeconomic biases, but then our algorithms would merely strive to confirm such biases within the entire population, not develop them independently from the raw data itself.

Gödel’s Second Theorem

In 1931, Kurt Gödel published two incompleteness theorems establishing the impossibility of Hilbert’s claims. His second incompleteness theorem can be paraphrased:

Given a set of axioms and all statements derived from these axioms, there cannot exist a statement within this set that proves the consistency of the system. If such a statement exists, then the system is inconsistent.

You can almost think of this like defining a word in the dictionary using the word itself: The self-referential nature negates the explanation.

The idea behind Gödel’s second incompleteness theorem closely mirrors the limitations we face in the task of defining human behavior. We need some “truth set” on which to base an algorithm, but at the same time, any method that both obtains an audience’s true behavior and proves its own consistency would run afoul of Gödel’s theorem.

Relative Consistency

While there may not be a method of deriving the absolute state of the world and knowing its degree of consistency, there is a way we can build ourselves up, layer by layer, using relative consistency.

Take, for example, people who own cars. Suppose we have a data set where 10% of the population consists of 18- to 23-year-olds. Our car ownership algorithm determines that 2% of all car owners are 18 to 23 years old.

This makes sense since young adults may be less capable of buying a car than older adults. The 2% number, when compared to the 10% number, appears accurate. But if the algorithm determined that 80% of all car owners are 18 to 23 years old, we would have a problem. The 80% number, when compared to the 10% number, does not appear to be anywhere near accurate.

In this case, the inconsistency in the results points to a potentially flawed algorithm or a corrupted input data set that is not representative of the true population. A check for the relative consistency of the results would tell us where a problem might exist, and prevent us from further iterations on a flawed algorithm and data set.
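The car ownership check can be sketched in a few lines of Python. This is only an illustration, not PlaceIQ’s method: the function names and the plausibility bounds (`min_ratio`, `max_ratio`) are assumptions made up for this example, and in practice an analyst would set such bounds from domain knowledge. The sketch compares a segment’s share of the algorithm’s output with its share of the baseline population and flags results that fall outside the chosen range.

```python
def representation_ratio(share_in_result: float, share_in_population: float) -> float:
    """How over- or under-represented a segment is in a result, relative to the baseline."""
    if not 0.0 < share_in_population <= 1.0:
        raise ValueError("share_in_population must be in (0, 1]")
    if not 0.0 <= share_in_result <= 1.0:
        raise ValueError("share_in_result must be in [0, 1]")
    return share_in_result / share_in_population


def is_relatively_consistent(
    share_in_result: float,
    share_in_population: float,
    min_ratio: float = 0.1,  # hypothetical lower bound, chosen for illustration
    max_ratio: float = 2.0,  # hypothetical upper bound, chosen for illustration
) -> bool:
    """Return True if the segment's representation falls inside the plausible range."""
    ratio = representation_ratio(share_in_result, share_in_population)
    return min_ratio <= ratio <= max_ratio


# 18- to 23-year-olds make up 10% of the data set (figure from the example above).
baseline_share = 0.10

# The algorithm reports 2% of car owners in that age group: under-represented, but plausible.
plausible = is_relatively_consistent(0.02, baseline_share)

# The algorithm reports 80% of car owners in that age group: far outside the plausible range.
suspicious = is_relatively_consistent(0.80, baseline_share)

print(plausible)   # True
print(suspicious)  # False
```

A failed check does not say whether the algorithm or the input data is at fault. It only signals that something should be examined before further iteration.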

Like the processes of quality assurance in a manufacturing plant and ongoing maintenance for the structural base of a skyscraper, these consistency checks are fundamental to the iterative process of extracting meaning from big data. While we rely on complex algorithms to augment human intelligence and intuition, we must also question the integrity of the algorithms themselves to ensure that inconsistencies are rooted out as early as possible.

Gödel’s theorems may only be applicable in a particularly esoteric branch of mathematics, but they still illustrate a lesson that we can all benefit from: It is better to iterate with relative consistency than to settle for inconsistent systems.

Follow PlaceIQ (@PlaceIQ) and AdExchanger (@adexchanger) on Twitter.
