The High Cost Of Bad Measurement: Why Randomized Geo Experiments Are The Gold Standard

The number-one job of a marketer is to invest budget wisely to drive sales. That inherently requires accurately measuring the performance of that spending. 

Yet most advertisers still rely on flawed measurement methods that systematically overstate performance and misallocate resources.

The measurement crisis

Even the smallest Fortune 500 companies generate roughly $10 billion in revenue, meaning they likely spend at least $1 billion on advertising. Whether you’re spending billions or mere millions, the stakes are too high to rely on half-measures when optimizing ROI.

Attribution modeling, matched-market tests, synthetic control methods and other quasi-experimental approaches dominate the measurement landscape despite well-documented limitations, including susceptibility to bias and a tendency to overfit toward pleasing results.

Scientific guidance is clear: quasi-experiments – where test and control groups aren’t assigned strictly at random – should only be used when randomized controlled trials (RCTs) are infeasible or unethical.

In advertising measurement, RCTs are rarely unethical and almost always feasible. Marketers often choose lesser standards due to a lack of understanding about how inferior those methods are for causal inference, or from misplaced concerns about the perceived cost and complexity of proper experimentation. The real risk isn’t in running robust tests; it’s in wasting money or cutting high-performing channels based on misleading conclusions from bad measurement.

The geographic experiment solution

Geographic experiments using Nielsen’s Designated Market Areas (DMAs) offer a remarkably straightforward and effective way to measure true incremental return on ad spend (iROAS). 

Unlike user-level experiments, which have been compromised by privacy changes and were never as accurate as claimed, geo experiments are independent of media platforms and provide deterministic results without personal data or expensive infrastructure like user ID graphs, tracking pixels or clean rooms.

The method is elegant in its simplicity:

  • Randomly assign all 210 US DMAs to test and control groups
  • Run your advertising campaign only in the test DMAs
  • Measure sales lift by counting transactions by geography from CRM data or sales panels
  • Calculate iROAS by dividing incremental revenue by campaign cost
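
As a rough illustration of the steps above, here is a minimal Python sketch. The function names (`assign_dmas`, `iroas`) and the inputs are hypothetical, and the counterfactual is deliberately naive: it assumes the control group's average revenue per DMA is what the test DMAs would have earned without the campaign. A real analysis would also account for differences in DMA size and baseline sales.

```python
import random


def assign_dmas(dma_ids, seed=0):
    """Randomly split DMA identifiers into two equal-sized groups."""
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    shuffled = list(dma_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])


def iroas(test_revenue, control_revenue, n_test, n_control, campaign_cost):
    """Incremental revenue divided by campaign cost.

    Assumption: control revenue per DMA, scaled to the number of test
    DMAs, approximates test-group revenue without the campaign.
    """
    expected_without_ads = control_revenue / n_control * n_test
    incremental_revenue = test_revenue - expected_without_ads
    return incremental_revenue / campaign_cost


# Hypothetical usage: 210 DMAs numbered 1 to 210, made-up revenue totals.
test_dmas, control_dmas = assign_dmas(range(1, 211), seed=42)
result = iroas(
    test_revenue=1_200_000,
    control_revenue=1_000_000,
    n_test=len(test_dmas),
    n_control=len(control_dmas),
    campaign_cost=100_000,
)
```

With these invented figures, `result` comes out at about 2.0, meaning roughly $2 of incremental revenue for every $1 of campaign spend.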

This methodology is transparent, replicable, unbiased and explainable. It works across virtually all media channels – TV, digital display, social, search, audio, out-of-home – making it ideal for validating channel performance and calibrating marketing mix models.

Achieving balance and statistical power with DMAs

A common apprehension about using DMAs in randomized experiments is that they may lack statistical power and balance, given that DMAs vary in size and characteristics and there are only 210 of them in the US. 

But in practice, several established techniques can overcome these limitations, achieving fine balance between test and control groups and enabling the detection of minimum effect sizes below 1% with 95% confidence.
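
To make the power question concrete, the following sketch uses the standard normal-approximation formula for the minimum detectable effect (MDE) in a two-group comparison. The function name, the 80% power default and the `r_squared` parameter (the share of outcome variance explained by pre-period covariates) are assumptions chosen for illustration, not figures from any particular experiment.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(sd, n_test, n_control,
                              alpha=0.05, power=0.8, r_squared=0.0):
    """Smallest true difference in means detectable at the given power.

    sd: standard deviation of the DMA-level outcome (e.g. as a fraction
        of mean sales, so the result is a relative lift).
    r_squared: variance explained by covariates; shrinks the residual sd.
    Uses a normal approximation, which is optimistic for very few units.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    residual_sd = sd * sqrt(1 - r_squared)
    return (z_alpha + z_power) * residual_sd * sqrt(1 / n_test + 1 / n_control)


# Illustrative inputs: 105 DMAs per group, outcome sd equal to 20% of the mean.
unadjusted = minimum_detectable_effect(0.2, 105, 105)
adjusted = minimum_detectable_effect(0.2, 105, 105, r_squared=0.99)
```

Under these made-up inputs, the unadjusted MDE is roughly 7.7%, while covariates that explain 99% of the variance bring it down to about 0.8%. This is why the balance and adjustment techniques matter so much when only 210 units are available.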

Key techniques include:

  • Covariate-constrained randomization (re-randomization): Generates thousands of potential random assignments (e.g., 10,000 draws) and selects one that meets pre-specified balance criteria. This approach offers precise control over balance while preserving the core principle of randomization.
  • Post-stratification and covariate adjustment: Starts with a basic random assignment, then uses statistical adjustment methods like regression or ANCOVA in the analysis phase to correct for any imbalances. This doesn’t alter the design but can recover power lost to imbalance.
  • Multi-armed, stepped cluster randomized trial (CRT): Instead of a single test group, this design staggers multiple test arms over time. Pretreatment periods from test markets can be used to enrich the control group, naturally improving statistical power and balance. For advertisers with large transaction volumes, this enables detection of effects as small as 0.5% or less.
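
The first technique, covariate-constrained randomization, can be sketched in a few lines of Python. This is a simplified illustration, not the code from the guide mentioned below: the function name, the single balance covariate (pre-campaign sales) and the 1% tolerance are all assumptions. It accepts the first random draw that satisfies the balance rule, which is one simple way to pick an assignment from the set of acceptable ones.

```python
import random
from statistics import mean


def rerandomize(baseline_sales, n_draws=10_000, max_imbalance=0.01, seed=0):
    """Draw random splits until one meets the balance criterion.

    baseline_sales: dict mapping DMA id -> pre-campaign sales (hypothetical).
    max_imbalance: largest acceptable relative gap in mean baseline sales.
    Returns (test_ids, control_ids); raises ValueError if no draw qualifies.
    """
    rng = random.Random(seed)
    dmas = sorted(baseline_sales)
    half = len(dmas) // 2
    for _ in range(n_draws):
        rng.shuffle(dmas)  # every candidate assignment is fully random
        test, control = dmas[:half], dmas[half:]
        test_mean = mean(baseline_sales[d] for d in test)
        control_mean = mean(baseline_sales[d] for d in control)
        gap = abs(test_mean - control_mean) / control_mean
        if gap <= max_imbalance:
            return sorted(test), sorted(control)
    raise ValueError("no draw met the balance criterion; loosen max_imbalance")
```

The balance rule must be fixed before any draws are made. In practice it would usually cover several covariates (population, prior sales, seasonality) rather than the single one used here.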

These and other methods are covered in detail in an open-source guide on GitHub, “How To Design a Geographic Randomized Controlled Trial,” which includes visual diagrams, step-by-step frameworks and over a dozen annotated Python code snippets.

Real-world impact

In one case study, a Fortune 100 brand used the multi-armed, stepped CRT method to evaluate one of its largest digital channels. The CMO had lost faith in their attribution analytics and was preparing to cut the channel’s budget in half.

The geo experiment revealed that the channel was a substantial driver of new customer acquisitions, primarily through offline sales that attribution models missed. Cutting the spend would have jeopardized over a billion dollars in annual revenue.

Implementation guidelines

To run effective geo experiments:

  • Start with major channels: Validate your largest investments first, where gains matter most.
  • Randomize properly: Avoid cherry-picking; random assignment is essential for valid results.
  • Include all DMAs: Experiments limited to just a few markets lack national generalizability.
  • Normalize for pre-periods: Compare each DMA to its own baseline to control for differences.
  • Predefine decision rules: Decide how results will inform budgets before seeing the data.
  • Consider timing: Most experiments need 4-6 weeks of exposure to detect meaningful effects.
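
The pre-period normalization guideline can be illustrated with a small Python sketch. The function name and data are hypothetical, and it assumes the pre-period and campaign period are the same length. Each DMA's campaign-period sales are indexed to its own pre-period sales before the groups are compared, so large and small markets contribute on the same scale.

```python
from statistics import mean


def baseline_normalized_lift(pre, post, test_ids):
    """Relative lift after indexing each DMA to its own pre-period sales.

    pre, post: dicts mapping DMA id -> sales in the pre-period and the
        campaign period (hypothetical inputs, equal-length periods).
    test_ids: DMA ids that received the campaign; the rest are controls.
    """
    test_ids = set(test_ids)
    test_index = mean(post[d] / pre[d] for d in pre if d in test_ids)
    control_index = mean(post[d] / pre[d] for d in pre if d not in test_ids)
    return test_index / control_index - 1


# Made-up example: test markets A and B are much smaller than controls C and D.
pre = {"A": 100, "B": 400, "C": 200, "D": 800}
post = {"A": 110, "B": 440, "C": 200, "D": 800}
lift = baseline_normalized_lift(pre, post, test_ids={"A", "B"})
```

In this invented example, raw campaign-period sales are lower in the test group (550 versus 1,000), yet the normalized comparison shows a lift of about 10%, because each test market grew 10% against its own baseline while the control markets were flat.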

The path forward

As AI increasingly drives marketing decisions, training these systems on accurate performance data is critical. Feeding flawed attribution into AI models will only magnify bias and misallocation.

The carpenter’s adage, “Measure twice, cut once,” applies well here. Before making million-dollar media decisions, invest in measurement that truly captures incremental impact. Your marketing effectiveness – and perhaps your job – may depend on it.

“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Follow Central Control and AdExchanger on LinkedIn.

For more articles featuring Rick Bruner, click here.
