How to A/B Test in Klaviyo Without Wasting Time on Meaningless Results
Most A/B tests are set up wrong — too many variables, too small a sample, declared too early. Here's how to test properly and actually learn something.

Most Shopify brands claim they A/B test their emails. What they actually do is change two things at once, send to a sample too small to mean anything, check results after 45 minutes, declare a winner, and learn nothing.
A/B testing done wrong is worse than not testing at all. It gives you false confidence in decisions based on noise instead of signal. You think you know what works, but you don't — and you make future decisions based on flawed data.
Here's how to A/B test in Klaviyo in a way that actually produces learnings you can use.
The One Rule That Matters Most
Test one variable at a time. One.
If you change the subject line AND the hero image AND the CTA button color in the same test, and Variant B outperforms Variant A, what did you learn? Was it the subject line? The image? The button? You have no idea. The test was useless.
Single-variable testing is slower. It requires patience. But every result you get is actionable because you know exactly what caused the difference.
What to Test (In Priority Order)
Not all variables have equal impact. Test in this order, spending the most time on the variables that move the biggest numbers:
1. Subject Lines (Highest Impact)
The subject line determines whether the email gets opened. A five-percentage-point lift in open rate on a 30,000-subscriber send means 1,500 more people seeing your email. That directly translates to more clicks and more revenue.
Subject line tests to run:
- Short vs. long (under 30 characters vs. 40+)
- Question vs. statement
- Specific number vs. no number ("3 new arrivals" vs. "New arrivals just dropped")
- Personalization vs. no personalization (using first name)
- Urgency vs. curiosity
- Emoji vs. no emoji
Run at least 8-10 subject line tests before moving to other variables. Subject lines are where the biggest gains live.
2. Hero Image / Above-the-Fold Content
Once the email is opened, the first thing the subscriber sees determines whether they scroll or bounce. Test:
- Product image vs. lifestyle image
- Single product vs. product grid
- Image with text overlay vs. clean image
- Different product as the hero
3. Call-to-Action
The CTA is where revenue happens. Test:
- Button text ("Shop Now" vs. "Get Yours" vs. "See the Collection")
- Button color (high contrast vs. brand color)
- Button placement (above fold only vs. repeated)
- Button size
4. Offer / Incentive
Test this last because it has margin implications:
- 10% off vs. free shipping
- Discount vs. no discount
- Fixed amount vs. percentage ("Save €15" vs. "Save 10%")
- Urgency framing ("Today only" vs. "This week")
Sample Size: The Number Everyone Gets Wrong
Klaviyo lets you A/B test with any sample size. This is both a feature and a trap: testing with 200 subscribers per variant tells you almost nothing.
The minimum sample size for a meaningful result is 1,000 subscribers per variant. Below this, the results are too noisy to draw reliable conclusions. Random variation dominates, and you'll see "winners" that are actually just statistical noise.
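If you want to sanity-check whether an observed gap is bigger than noise, a two-proportion z-test is the standard tool. Here's a minimal sketch in plain Python (the counts are made-up illustrations, not Klaviyo data) showing how the same four-point open-rate gap reads as noise at 200 per variant but as a likely real difference at 1,000:

```python
import math

def two_proportion_z(opens_a, n_a, opens_b, n_b):
    """Two-proportion z-test: how many standard errors separate
    the two open rates? |z| >= 1.96 is significant at the 95% level."""
    p_a, p_b = opens_a / n_a, opens_b / n_b
    p_pool = (opens_a + opens_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# The same 25% vs. 29% open-rate gap at two sample sizes:
print(f"{two_proportion_z(50, 200, 58, 200):.2f}")      # 0.90 -> noise
print(f"{two_proportion_z(250, 1000, 290, 1000):.2f}")  # 2.01 -> likely real
```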
For Shopify brands with lists under 5,000, this creates a challenge. You might not have enough subscribers to run a statistically meaningful A/B test on a single campaign. The workaround: test across multiple campaigns. Use Subject Line A on Monday's campaign and Subject Line B on Wednesday's campaign to the same segment. After 4-5 rounds, you'll have enough data to see patterns.
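Once you've run those rounds, pool the counts and apply the same test. It's a rough check (day-of-week effects aren't controlled for), but it's far better than eyeballing single sends. Continuing the hypothetical numbers and the two_proportion_z sketch above:

```python
# Hypothetical counts: four alternating sends to the same ~900-person
# segment, pooled by subject-line style, then the same z-test as above.
a_opens, a_sent = 230 + 210 + 250 + 245, 4 * 900  # statement subject lines
b_opens, b_sent = 260 + 255 + 270 + 280, 4 * 900  # question subject lines
print(f"{two_proportion_z(a_opens, a_sent, b_opens, b_sent):.2f}")  # 3.42
```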
For brands with 10,000+ subscribers, Klaviyo's built-in A/B test feature works well. Send Variant A to 25% of the list and Variant B to 25%, wait for results, then send the winner to the remaining 50%.
Timing: When to Declare a Winner
Don't check results after one hour and declare a winner. Email open patterns vary throughout the day. Some subscribers check email in the morning. Others check at lunch. Others in the evening.
The minimum wait time before declaring a winner: 4 hours. This allows enough of your audience to see and interact with the email for the results to stabilize.
For the most reliable results, wait 24 hours. The difference between Variant A and Variant B often narrows or shifts after the initial burst of opens.
Klaviyo's automatic winner selection can be set to 4 hours, which is the minimum we recommend. If you're manually evaluating, wait longer.
How to Read Results
Don't just look at open rates or click rates. Look at the metric that matters for the variable you're testing:
- Subject line test → measure by open rate
- Hero image test → measure by click-through rate
- CTA test → measure by click-to-open rate (clicks ÷ opens)
- Offer test → measure by conversion rate (orders ÷ delivered)
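As a quick reference, here's how those four metrics fall out of the raw per-variant counts Klaviyo reports. This is a minimal sketch with illustrative numbers; the function and field names are my own, not Klaviyo's API:

```python
def variant_metrics(delivered, opens, clicks, orders):
    """Derive the four decision metrics from raw per-variant counts."""
    return {
        "open_rate": opens / delivered,            # subject line tests
        "click_through_rate": clicks / delivered,  # hero image tests
        "click_to_open_rate": clicks / opens,      # CTA tests
        "conversion_rate": orders / delivered,     # offer tests
    }

print(variant_metrics(delivered=10_000, opens=2_700, clicks=540, orders=81))
```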
A subject line that gets more opens but fewer clicks might be misleading — it could be attracting curiosity without delivering on the promise. Always check secondary metrics to make sure the "winner" actually performed better end-to-end.
The Documentation Habit That Compounds
Every test result should be documented. Create a simple log:
- Date of test
- Variable tested
- Variant A description
- Variant B description
- Sample size per variant
- Primary metric result (A vs. B)
- Winner
- Confidence level (clear winner vs. marginal)
- Takeaway (one sentence about what you learned)
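The log doesn't need to be fancy. A spreadsheet works; so does a CSV you append to from a script. A minimal sketch of the latter, with the file name, field names, and example values as suggestions only:

```python
import csv
from datetime import date

FIELDS = ["date", "variable", "variant_a", "variant_b",
          "sample_per_variant", "result_a", "result_b",
          "winner", "confidence", "takeaway"]

def log_test(path, **row):
    """Append one test result to a running CSV log."""
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if f.tell() == 0:  # new file: write the header row first
            writer.writeheader()
        writer.writerow(row)

log_test("ab_test_log.csv",
         date=str(date.today()), variable="subject line",
         variant_a="question", variant_b="statement with number",
         sample_per_variant=1500, result_a="24.1% open",
         result_b="27.8% open", winner="B", confidence="clear",
         takeaway="Specific numbers beat open-ended questions.")
```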
After 20-30 documented tests, patterns emerge that are specific to your audience. You'll discover things like: your audience opens more when subject lines ask a question, product images consistently outperform lifestyle images, and free shipping converts better than a percentage discount.
These audience-specific insights are more valuable than any generic best practice. They're earned through your own data, not borrowed from someone else's.
A/B Testing in Flows vs. Campaigns
Campaign A/B tests are straightforward: two variants, one send, measure results.
Flow A/B tests are different and more powerful. In Klaviyo, you can add a conditional split in any flow to randomly route subscribers down two different paths. This creates a persistent test that runs on every subscriber who enters the flow.
Flow tests take longer to accumulate data because entries happen gradually (not all at once like a campaign send). But they're ideal for testing:
- Different email copy in a Welcome Series
- Discount vs. no discount in an Abandoned Checkout flow
- 3-email sequence vs. 4-email sequence
- Different delay timings between emails
Let flow tests run for at least 2-4 weeks (or until you have 500+ subscribers per path) before evaluating.
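A quick back-of-envelope check for how long that will take, given your flow's entry rate (the 60 entries/day figure is an assumed example):

```python
def weeks_to_sample(entries_per_day, target_per_path=500, paths=2):
    """Rough time for an even flow split to reach the target sample."""
    return paths * target_per_path / entries_per_day / 7

print(f"{weeks_to_sample(entries_per_day=60):.1f} weeks")  # 2.4 weeks
```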
Common Testing Mistakes
Testing too many things at once. Already covered, but worth repeating. One variable. One test. One learning.
Declaring winners too early. An hour of data isn't enough. Wait at least 4 hours, preferably 24.
Testing with too small a sample. 200 subscribers per variant is noise. Aim for 1,000+.
Not testing at all because the list is "too small." Even with a 3,000-subscriber list, you can test across multiple sends over time. Slow data is better than no data.
Testing low-impact variables first. Don't start with button color. Start with subject lines. The highest-impact variables deserve the most testing attention.
Not documenting results. A test you can't reference later is a test wasted. Log everything.
The Bottom Line
A/B testing isn't a nice-to-have. It's the mechanism that turns email marketing from guesswork into a system that improves over time.
Test one variable at a time. Use proper sample sizes. Wait long enough for results to stabilize. Document everything. Test subject lines first, then work down the priority list.
Every well-run test makes your next campaign a little better. That's how email programs compound from good to exceptional — one data point at a time.

Tsvetan Emil
Klaviyo Email & SMS Specialist