Holdout Test

TL;DR

A holdout test withholds advertising from a randomly selected subset of users (typically 10–20%) to measure what the conversion rate looks like without the ad. That subset is the control group you need to calculate true incremental lift. Without a holdout, you're measuring correlation between ad exposure and conversion, not causation.

What is a holdout test?

A holdout test (also called a ghost ad experiment, conversion holdout, or platform lift study) is a controlled experiment in which a random subset of the target audience, the holdout group, is withheld from seeing an ad while the rest of the audience continues to receive the campaign normally. The difference in conversion rate between the exposed and holdout groups, usually expressed relative to the holdout rate, is the incremental lift.

Holdout tests are the most rigorous method for measuring whether a specific ad campaign is causing conversions or merely correlating with them. The holdout group serves as the counterfactual: what would conversion rate look like without the ad? Most ad platforms (Meta, Google) offer built-in holdout or conversion lift study tools that automate this at scale.

Holdout tests are most valuable for retargeting campaigns, where organic conversion intent is highest and attribution inflation is most severe. A retargeting holdout on a D2C brand often reveals that 40–70% of attributed conversions would have occurred organically — customers who were already planning to repurchase. (See conversion lift for the detailed measurement framework.)
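
For a manual holdout run outside a platform tool, the random split described in this section could be sketched as a deterministic hash of the user ID. This is a minimal illustration, not a platform's method: the function name `in_holdout`, the salt string, and the synthetic user IDs are all assumptions made up for the example.

```python
import hashlib

def in_holdout(user_id: str, salt: str = "example-campaign", holdout_pct: float = 0.10) -> bool:
    """Return True if the user falls in the holdout group.

    Hashing (salt + user_id) gives a stable, effectively random bucket, so the
    same user always lands in the same group for a given test. The salt is a
    hypothetical per-test label; changing it reshuffles the assignment.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # first 32 bits mapped to [0, 1)
    return bucket < holdout_pct

# Synthetic IDs, for illustration only.
user_ids = [f"user-{i}" for i in range(100_000)]
holdout = {u for u in user_ids if in_holdout(u)}
exposed = [u for u in user_ids if u not in holdout]
holdout_share = len(holdout) / len(user_ids)
print(f"Holdout share: {holdout_share:.1%}")
```

Hashing rather than drawing fresh random numbers keeps the assignment reproducible, which matters when the exclusion list has to be rebuilt during the test.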

Why holdout tests matter for operators

The cost of not running holdout tests is systematic budget waste on low-incrementality channels. Operators who allocate spend based purely on last-click ROAS end up over-investing in retargeting (which scores well on attribution but drives minimal incremental revenue) and under-investing in prospecting (which scores poorly on last-click but drives new customers).

The scale of attribution inflation is larger than most operators expect. Industry analysis of holdout tests across D2C brands consistently shows that 30–60% of revenue attributed to retargeting campaigns is organic. On a $50K/month retargeting budget, that represents $15K–$30K/month spent on conversions that would have happened without the ad.
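
The arithmetic behind that range is simple enough to check directly. The sketch below assumes, as the paragraph does, that the organic share of attributed revenue maps one-to-one onto the share of spend; the function name is hypothetical.

```python
def organic_spend_range(monthly_budget: float, organic_low: float, organic_high: float) -> tuple[float, float]:
    """Spend attributable to conversions that would have happened anyway."""
    return monthly_budget * organic_low, monthly_budget * organic_high

# $50K/month retargeting budget, 30-60% organic share (figures from the text above).
low, high = organic_spend_range(50_000, 0.30, 0.60)
print(f"${low:,.0f} to ${high:,.0f} per month")
```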

For B2B SaaS operators, holdout tests on nurture email sequences reveal whether the sequence is accelerating conversion or merely observing it. A mid-funnel email sequence that appears to generate 35% of conversions in attribution reporting might produce only 8% incremental lift in a holdout — the leads were converting anyway.

How to run a holdout test

Holdout Test Structure:

1. Define holdout percentage
   - Standard: 10–20% of target audience held out
   - Larger holdout = more statistical power; smaller = more ad reach
   - Rule of thumb: use the smallest holdout that achieves target power

2. Randomise assignment (platform-level or cookie-level)
   - Meta Conversion Lift: automated holdout in Ads Manager
   - Google Conversion Lift: automated holdout for Google Ads campaigns (Brand Lift measures survey-based brand metrics, not conversions)
   - Manual: draw a 10% random sample of user IDs and exclude them from targeting

3. Run for full conversion window
   - Define: the number of days after ad exposure during which conversions count
   - B2B SaaS: 30–90 days
   - D2C: 7–30 days

4. Calculate lift
   Exposed group conversion rate: 3.4%  (n = 90,000)
   Holdout group conversion rate:  2.1%  (n = 10,000)

   Lift = ((3.4% − 2.1%) / 2.1%) × 100 = 61.9%

   Incremental CPA:
     Ad spend: $22,000
     Incremental conversions = 90,000 × (3.4% − 2.1%) = 1,170
     Incremental CPA = $22,000 / 1,170 = $18.80
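
The step 4 calculation can be reproduced in a few lines. This is a minimal sketch using the figures from the worked example above; the function name `lift_summary` is an assumption, and the code does nothing beyond the three formulas shown.

```python
def lift_summary(exposed_n: int, exposed_rate: float, holdout_rate: float, ad_spend: float):
    """Return (relative lift, incremental conversions, incremental CPA)."""
    if holdout_rate <= 0:
        raise ValueError("holdout_rate must be positive to compute relative lift")
    rate_gap = exposed_rate - holdout_rate
    relative_lift = rate_gap / holdout_rate
    # Conversions in the exposed group beyond what the holdout rate predicts.
    incremental_conversions = exposed_n * rate_gap
    if incremental_conversions <= 0:
        raise ValueError("no positive incremental conversions; incremental CPA is undefined")
    incremental_cpa = ad_spend / incremental_conversions
    return relative_lift, incremental_conversions, incremental_cpa

lift, inc_conv, inc_cpa = lift_summary(
    exposed_n=90_000, exposed_rate=0.034, holdout_rate=0.021, ad_spend=22_000
)
print(f"Lift: {lift:.1%}")
print(f"Incremental conversions: {inc_conv:,.0f}")
print(f"Incremental CPA: ${inc_cpa:,.2f}")
```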

Holdout test benchmarks by campaign type

Campaign type	Typical holdout size	Typical lift	Attribution inflation
D2C retargeting (existing customers)	15–20%	5–18%	50–70% of attributed revenue is organic
D2C prospecting (new audiences)	10–15%	25–55%	15–30% attribution inflation
B2B SaaS retargeting	10–15%	8–22%	30–50% of attributed pipeline is organic
B2B SaaS nurture email sequence	10%	5–15%	40–60% would have converted without the email
B2B SaaS brand awareness	15–20%	10–25%	Highly variable; geo-lift test often more reliable

Sources: Meta Conversion Lift Studies 2024; Google Brand Lift Studies 2024; Pavilion Operator Survey 2024; Fairview customer data.

Common mistakes when running holdout tests

1. Running a holdout that's too small for statistical significance. A 2% holdout on a campaign with 500 conversions per month generates only 10 holdout conversions — far too few to detect any meaningful lift difference. Use a power calculator. Most holdout tests require at least 1,000 conversions in the holdout group to be reliable.

2. Measuring too early in the conversion window. If you measure a holdout test after 7 days on a product with a 45-day average sales cycle, you're measuring only the fastest-converting leads. The full incremental impact won't be visible until after the full conversion window has elapsed.

3. Running concurrent promotions that confound the result. If you run a holiday sale during a holdout test, the sale shifts conversion behaviour in both groups, so the measured lift reflects the ad's effect during a promotion rather than under normal conditions. Run holdout tests during neutral periods with no major promotional events.

4. Not using the results to update budget allocation. Holdout tests are worth nothing if the results sit in a report that nobody acts on. The output — incremental CPA per channel — should directly update the channel budget allocation in the next planning cycle.

5. Conflating holdout tests with geo-lift tests. Holdout tests randomise at the user or cookie level on a single platform. Geo-lift tests randomise at the geographic-market level across channels. For single-platform direct-response campaigns, holdout tests are more precise. For cross-channel or offline campaigns, geo-lift tests are more practical.
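
Mistake 1 turns on statistical significance. The source does not name a specific test, so the following is an assumption: one common check is a pooled two-proportion z-test, sketched here with the group sizes and conversion rates from the worked example (3.4% of 90,000 is 3,060 conversions; 2.1% of 10,000 is 210). The function name is hypothetical.

```python
import math

def two_proportion_z_test(conv_exposed: int, n_exposed: int, conv_holdout: int, n_holdout: int):
    """Pooled two-proportion z-test; returns (z statistic, two-sided p-value)."""
    p_exposed = conv_exposed / n_exposed
    p_holdout = conv_holdout / n_holdout
    # Pooled rate under the null hypothesis that the ad has no effect.
    pooled = (conv_exposed + conv_holdout) / (n_exposed + n_holdout)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_exposed + 1 / n_holdout))
    z = (p_exposed - p_holdout) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p_value = two_proportion_z_test(3_060, 90_000, 210, 10_000)
print(f"z = {z:.2f}, p = {p_value:.2g}")
```

A small p-value says only that the gap between groups is unlikely to be noise; it says nothing about whether the test ran for the full conversion window or overlapped a promotion.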

How Fairview connects holdout results to channel decisions

Fairview's Margin Intelligence module connects holdout test results from Meta and Google to channel-level spend and contribution margin. When a holdout reveals that retargeting lift is 12%, that percentage updates the channel's effective ROAS — the correct number to use for budget allocation decisions.

The Next-Best Action Engine flags when holdout testing is overdue: "Retargeting ROAS has been above 5× for 60+ days with no incrementality test on record. Attribution inflation is likely. Recommend running a 15% holdout on the retargeting campaign for 21 days to validate."

Companies using Fairview that run holdout tests and input results typically reallocate 15–25% of paid budget from low-lift retargeting to higher-lift prospecting within the following quarter.

Frequently asked questions

What is a holdout test in simple terms?

A holdout test is an experiment in which you deliberately stop showing your ad to a random 10–20% of your target audience, then compare conversion rates between the people who saw the ad and those who didn't. The difference is the incremental lift your ad is actually causing.

How is a holdout test different from a geo-lift test?

A holdout test randomises at the user or cookie level on a single platform — best for digital-only campaigns. A geo-lift test randomises at the geographic-market level — better for cross-channel, offline, or TV/podcast campaigns that can't target individual users. Both measure incrementality; the method differs based on what the channel allows.

What holdout size should you use?

Start with 15–20% for most campaigns. Use a statistical power calculator to verify that size will generate enough holdout conversions for significance. For campaigns with very high conversion volume, 10% is sufficient. For low-volume campaigns, you may need 20–25% to get enough data.
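
What a power calculator does can be sketched with the standard normal approximation for comparing two proportions with unequal group sizes. This is an illustrative approximation, not a replacement for a dedicated tool; the function name, the default 9:1 exposed-to-holdout ratio (a 10% holdout), and the 5% significance / 80% power defaults are assumptions.

```python
import math
from statistics import NormalDist

def required_holdout_size(
    baseline_rate: float,
    relative_lift: float,
    exposed_per_holdout: float = 9.0,  # 9:1 split corresponds to a 10% holdout
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    """Approximate number of holdout users needed to detect the given lift."""
    p_c = baseline_rate                        # holdout (control) conversion rate
    p_t = baseline_rate * (1 + relative_lift)  # exposed conversion rate
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    # Variance of the rate difference, per holdout user.
    variance = p_c * (1 - p_c) + p_t * (1 - p_t) / exposed_per_holdout
    n = (z_alpha + z_beta) ** 2 * variance / (p_t - p_c) ** 2
    return math.ceil(n)

n_large_lift = required_holdout_size(0.021, 0.619)  # rates from the worked example
n_small_lift = required_holdout_size(0.021, 0.10)   # a hypothetical 10% lift
print(n_large_lift, n_small_lift)
```

The required size grows roughly with the inverse square of the expected lift, which is why low-lift campaigns such as retargeting tend to need larger or longer holdouts.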

Does every campaign need a holdout test?

No — holdout tests are most valuable for retargeting and nurture sequences, where organic intent is high and attribution inflation is most severe. Prospecting campaigns generally show genuine lift and don't need holdout validation as frequently. Prioritise holdouts for your 3–5 largest retargeting or remarketing budgets.

How do you act on holdout test results?

Calculate incremental CPA (ad spend / incremental conversions, not attributed conversions). Compare incremental CPA across all channels. Reduce budget on channels where incremental CPA exceeds target, and increase budget on channels where incremental CPA is below target. Rerun the test quarterly — lift changes as audiences, creative, and competitive intensity shift.
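
The comparison step can be sketched as below. The class and function names, the $25 target CPA, and the `channel_b` figures are hypothetical, included only to show both outcomes; `channel_a` reuses the spend and incremental conversions from the worked example.

```python
from dataclasses import dataclass

@dataclass
class ChannelResult:
    name: str
    ad_spend: float
    incremental_conversions: float  # from the holdout test, not attribution

def budget_action(result: ChannelResult, target_cpa: float) -> tuple[float, str]:
    """Return (incremental CPA, suggested direction) for one channel."""
    if result.incremental_conversions <= 0:
        # No measurable incremental conversions: CPA is effectively unbounded.
        return float("inf"), "reduce"
    icpa = result.ad_spend / result.incremental_conversions
    if icpa > target_cpa:
        return icpa, "reduce"
    if icpa < target_cpa:
        return icpa, "increase"
    return icpa, "hold"

channels = [
    ChannelResult("channel_a", ad_spend=22_000, incremental_conversions=1_170),
    ChannelResult("channel_b", ad_spend=30_000, incremental_conversions=600),  # hypothetical
]
actions = {c.name: budget_action(c, target_cpa=25.0) for c in channels}
for name, (icpa, action) in actions.items():
    print(f"{name}: incremental CPA ${icpa:,.2f} -> {action}")
```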

Sources

OpenView SaaS Benchmarks 2025
Pavilion Operator Survey 2024
Common Thread Collective D2C Benchmarks 2025
ProfitWell Research
Fairview customer data (B2B SaaS + D2C, 2025)

Fairview is an operating intelligence platform that connects holdout test results to channel budget decisions, replacing attributed ROAS with incrementality-adjusted ROI.

Siddharth Gangal is the founder of Fairview. He built the incrementality layer after watching operators scale retargeting spend to $60K/month based on 5× attributed ROAS — then run a holdout test and discover that 55% of those conversions were organic.
