How to Split Test Properly

by | Dec 17, 2020

I recently switched my email sends from an old domain to the Deliverability Dashboard domain.

I wasn’t totally surprised to see lower open rates when I sent from the new domain. I suspect the new domain’s reputation still needs to be improved, but the drop could just as easily have been caused by the actual email, its content, or lots of other things.

So, to find out whether the domain I’m sending from makes a difference, I’m going to conduct a split test.

Here’s the thing, though.

Not all split tests are created equal.

A “truly random” split test, where you just randomly put 50% of your list into one bucket and the other 50% into another, can snatch defeat from the jaws of victory.

It’s unlikely, but it’s possible that all of your engaged contacts who use Google go into one half of the split, and all your unengaged contacts who use Microsoft go into the other half. Which would skew the results rather spectacularly.
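
To see how a plain random split can drift, here’s a minimal Python sketch. The contact list is made up for illustration (10 engaged Google contacts and 10 unengaged Microsoft contacts), and `simple_random_split` is a hypothetical helper, not part of any tool mentioned in this post.

```python
import random
from collections import Counter


def simple_random_split(contacts, seed):
    """Shuffle the whole list and cut it in half, ignoring contact attributes."""
    rng = random.Random(seed)
    shuffled = list(contacts)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Made-up list: 10 engaged Google contacts and 10 unengaged Microsoft contacts.
contacts = [("google", "engaged")] * 10 + [("microsoft", "unengaged")] * 10

# Print the make-up of each half for a few different shuffles.
for seed in range(5):
    half_a, half_b = simple_random_split(contacts, seed)
    print(seed, dict(Counter(half_a)), dict(Counter(half_b)))
```

Each half always contains 10 contacts, but nothing forces it to contain 5 of each kind, so the mix can differ from one shuffle to the next.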

So, in today’s post, I’ll explain how I’m going about my split test to make it as “fair” as possible.

My Approach

The elements that need to be equally distributed for the test to be fair are:

  • Mailbox providers
  • Recency of engagement

So, in an ideal world, we’d need to create something like this:

                 | Engagement Period
Mailbox Provider | 0-7 days  | 7-30 days | 30-60 days | 60-90 days | 90-180 days | 180-365 days | 365+ days
Gsuite           | A/B Split | A/B Split | A/B Split  | A/B Split  | A/B Split   | A/B Split    | A/B Split
Gmail            | A/B Split | A/B Split | A/B Split  | A/B Split  | A/B Split   | A/B Split    | A/B Split
Hotmail          | A/B Split | A/B Split | A/B Split  | A/B Split  | A/B Split   | A/B Split    | A/B Split
Office365        | A/B Split | A/B Split | A/B Split  | A/B Split  | A/B Split   | A/B Split    | A/B Split
Yahoo            | A/B Split | A/B Split | A/B Split  | A/B Split  | A/B Split   | A/B Split    | A/B Split
AOL              | A/B Split | A/B Split | A/B Split  | A/B Split  | A/B Split   | A/B Split    | A/B Split
etc.             | A/B Split | A/B Split | A/B Split  | A/B Split  | A/B Split   | A/B Split    | A/B Split
etc.             | A/B Split | A/B Split | A/B Split  | A/B Split  | A/B Split   | A/B Split    | A/B Split

What that table is trying to show is that each cell represents one combination of mailbox provider and engagement period, and for each cell we’d need to take that segment of the audience and create a 50/50 A/B split.

So, in the example above, that would be 56 different segments (8 mailbox provider rows × 7 engagement periods), each needing its own A/B split… and then you’d probably combine all the “A” halves together, and all the “B” halves together. Unless you fancy sending out 112 separate email broadcasts, that is!
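
The counts above are easy to check with a few lines of Python. This is only a sketch of the arithmetic: the provider and period labels are copied from the table, and the two “etc.” rows are given placeholder names.

```python
from itertools import product

# Rows and columns of the table; the two "etc." rows are placeholders.
providers = ["Gsuite", "Gmail", "Hotmail", "Office365",
             "Yahoo", "AOL", "etc. (1)", "etc. (2)"]
periods = ["0-7 days", "7-30 days", "30-60 days", "60-90 days",
           "90-180 days", "180-365 days", "365+ days"]

# One cell per (provider, period) combination.
cells = list(product(providers, periods))

# One broadcast per cell per variant if nothing is combined.
broadcasts = [(provider, period, variant)
              for provider, period in cells
              for variant in ("A", "B")]

print(len(cells))       # 56
print(len(broadcasts))  # 112
```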

My Slightly More Pragmatic Test

Luckily, when I analysed my own list, there were only three “buckets” of mailbox provider that were worth considering:

  1. Google (Gsuite and Gmail together)
  2. Microsoft (Hotmail and Office365 together)
  3. Everything else

To make my life easier, I also “eat my own dogfood” and only send emails to people who’ve engaged within the last 90 days.

So, I chose to use two engagement “buckets” as follows:

  1. Contacts engaged in the last 30 days
  2. Contacts engaged between 30 and 90 days ago

Rather than those unwieldy 56 segments, this gave me a slightly more manageable 6 segments (3 mailbox provider buckets × 2 engagement buckets).

Putting It Into Action

I use Infusionsoft, where I created the following six segments, each with its own tag:

  • Google 0-30 days engaged
  • Google 30-90 days engaged
  • Microsoft 0-30 days engaged
  • Microsoft 30-90 days engaged
  • Others 0-30 days engaged
  • Others 30-90 days engaged

I used the Deliverability Defender “domain” tags and “engagement” tags to quickly and easily identify each of the six segments.
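
If you don’t have tags like these to lean on, the bucketing logic can be sketched in Python as below. This is not how Deliverability Defender or Infusionsoft does it. It assumes each contact already has a mailbox provider label (spelled as in the table earlier) and a count of days since their last engagement. The function name `segment_for` is hypothetical, and putting a contact at exactly 30 days into the first bucket is my own assumption.

```python
from typing import Optional

# Assumed provider labels, spelled as in the table earlier in the post.
PROVIDER_BUCKETS = {
    "Gsuite": "Google",
    "Gmail": "Google",
    "Hotmail": "Microsoft",
    "Office365": "Microsoft",
}


def segment_for(provider: str, days_since_engagement: int) -> Optional[str]:
    """Return the segment name for a contact, or None if they are not mailed."""
    # Anything that is not Google or Microsoft falls into "Others".
    bucket = PROVIDER_BUCKETS.get(provider, "Others")
    # Contacts with no engagement in the last 90 days are left out entirely.
    if days_since_engagement < 0 or days_since_engagement > 90:
        return None
    # Assumption: exactly 30 days counts as the 0-30 bucket.
    window = "0-30 days" if days_since_engagement <= 30 else "30-90 days"
    return f"{bucket} {window} engaged"


print(segment_for("Gmail", 3))       # Google 0-30 days engaged
print(segment_for("Office365", 45))  # Microsoft 30-90 days engaged
print(segment_for("AOL", 120))       # None
```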

I then used a simple script that I wrote 4 or 5 years ago to go through each of the six segments and randomly assign an “A” or a “B” tag to every contact in it.
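
That script isn’t shown here, so the following is not it, just a minimal Python sketch of the same idea. It assumes each segment is a plain list of contact IDs. The function name `assign_ab` and the tag format (segment name plus “A” or “B”) are hypothetical.

```python
import random


def assign_ab(segments, seed=None):
    """Split every segment 50/50 at random.

    segments: dict mapping segment name -> list of contact IDs.
    Returns a dict mapping contact ID -> tag such as "<segment name> A".
    """
    rng = random.Random(seed)
    tags = {}
    for segment_name, contact_ids in segments.items():
        shuffled = list(contact_ids)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        # With an odd number of contacts, the extra one goes to A or B at random.
        if len(shuffled) % 2 and rng.random() < 0.5:
            half += 1
        for contact_id in shuffled[:half]:
            tags[contact_id] = f"{segment_name} A"
        for contact_id in shuffled[half:]:
            tags[contact_id] = f"{segment_name} B"
    return tags


example = {
    "Google 0-30 days engaged": [1, 2, 3, 4, 5, 6],
    "Microsoft 30-90 days engaged": [7, 8, 9, 10],
}
for contact_id, tag in sorted(assign_ab(example, seed=42).items()):
    print(contact_id, tag)
```

Shuffling each segment and cutting it in half, rather than flipping a coin per contact, is a deliberate choice in this sketch: it guarantees the “A” and “B” halves of every segment differ in size by at most one contact.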

So I’m now sending 12 broadcasts as follows:

Sent from Old Domain             | Sent from New Domain
Google 0-30 days engaged “A”     | Google 0-30 days engaged “B”
Google 30-90 days engaged “A”    | Google 30-90 days engaged “B”
Microsoft 0-30 days engaged “A”  | Microsoft 0-30 days engaged “B”
Microsoft 30-90 days engaged “A” | Microsoft 30-90 days engaged “B”
Others 0-30 days engaged “A”     | Others 0-30 days engaged “B”
Others 30-90 days engaged “A”    | Others 30-90 days engaged “B”
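
The same plan can be written out as data, which makes it harder to send a broadcast from the wrong domain by accident. This is a sketch only: the tag format and the “old domain” / “new domain” labels are placeholders, not real Infusionsoft settings.

```python
SEGMENTS = [
    "Google 0-30 days engaged",
    "Google 30-90 days engaged",
    "Microsoft 0-30 days engaged",
    "Microsoft 30-90 days engaged",
    "Others 0-30 days engaged",
    "Others 30-90 days engaged",
]

# "A" halves go out from the old domain, "B" halves from the new one.
SENDING_DOMAIN = {"A": "old domain", "B": "new domain"}


def broadcast_plan():
    """Return one (audience tag, sending domain) pair per broadcast."""
    return [(f"{segment} {variant}", SENDING_DOMAIN[variant])
            for segment in SEGMENTS
            for variant in ("A", "B")]


plan = broadcast_plan()
print(len(plan))  # 12
for audience, domain in plan:
    print(f"{audience} -> {domain}")
```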

It’s probably taken longer to write this blog post than it took to do the actual work in Infusionsoft! But even so, it’s still a lot of work and not something I’ll be doing very often.

Hmmm, I’m a software developer… maybe this needs a new tool to be developed 😉

The Results

I’ve written this blog post prior to sending out the test described here. The email I’m sending links to this page… so, if you want to see the results, come back in a week or two and I’ll write another post describing anything I’ve learned from the results.

I really hope this post has been helpful.

If you decide to do a full-on split test differently after reading it, please get in touch and let me know. I’d love to hear from you!