7 Laws of Split Testing That Most People Completely Ignore

by Neil Patel

Last updated on July 27th, 2017

Ah, split testing. We love it. We swear by it. We do it. We tell other people to do it. We learn from it. We read case studies about it.

And sometimes, we completely blow it.

A/B testing is not quite as easy as buying some software, making some changes, and running a test. If it were that easy, more people would be doing it, and optimizing the heck out of their sites.

Clearly, however, most people aren’t optimizing the heck out of their sites. Where’s the disconnect? There are some errors that are really easy to make. Even the pros tend to commit some of these blunders from time to time.

If you aren’t seeing the level of results that you want, or are frustrated with your split test findings, you’d do well to read this article. Making some small yet significant changes can revolutionize your approach to split testing.

Here are the top seven laws of split testing that you might be violating.

1. You must A/A test before you A/B test.

Everyone’s heard of A/B testing, but what about A/A testing?

An A/A test is one in which you test your site or page against itself. No variables. No changes. Just test the page, and test it again.

Why would you do this? Simple. You want to validate the test setup. For example, if you’re going to run an A/B test, you should first run an A/A test to ensure that the timing and dataset of the forthcoming split test are going to be accurate.

Besides, there’s the added benefit of testing the testing software itself. If you encounter major glitches or dataset disconnects in your software, then you’re able to fix the bugs before you go to the work of creating a full split test.

An A/A test is simply a safe method of clearing the runway before the more comprehensive and weighted split test.

Some experienced testers don’t agree that A/A testing is necessary. Why not? Because they’ve found alternate methods of qualifying the data and ensuring the precision of their test.

If you’re like Craig Sullivan and have scads of experience, analytics integration access, cross-browser and device testing, triangulated data analysis, instrumentation observations, and compatibility monitoring, then you might not have to do an A/A test.

But if you’re kind of normal, then go ahead and run an A/A test — comparing two versions of the same thing. If the results of the A/A test are similar, then you’re ready to roll with your split test. If the results are wildly different, you know that you have more prep work to do.

Once the A/A test looks good, you can start your first A/B test.
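
To get a feel for what “similar” looks like, here is a rough simulation sketch. It is not part of any testing tool: the function name, the 5% conversion rate, and the traffic numbers are all made up for illustration. It simply shows how far apart two identical pages can land by pure chance at different traffic levels.

```python
import random
import statistics


def simulate_aa_gaps(true_rate, visitors_per_arm, runs=200, seed=7):
    """Simulate repeated A/A tests where both arms share one true conversion rate.

    Returns the absolute gap in observed conversion rate for each simulated test.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    gaps = []
    for _ in range(runs):
        conversions_a = sum(rng.random() < true_rate for _ in range(visitors_per_arm))
        conversions_b = sum(rng.random() < true_rate for _ in range(visitors_per_arm))
        gaps.append(abs(conversions_a - conversions_b) / visitors_per_arm)
    return gaps


# Hypothetical page with a 5% true conversion rate, at two traffic levels.
small_gaps = simulate_aa_gaps(true_rate=0.05, visitors_per_arm=200)
large_gaps = simulate_aa_gaps(true_rate=0.05, visitors_per_arm=2000)

print(f"Typical gap with 200 visitors per arm:  {statistics.median(small_gaps):.2%}")
print(f"Typical gap with 2000 visitors per arm: {statistics.median(large_gaps):.2%}")
```

The gap shrinks as traffic grows, so an A/A test on a small sample can look “wildly different” even when nothing is broken. Judge the result against the amount of traffic behind it.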

2. You must test the most important things first.

Marketers like to test the little things…like the impact of button color on CTRs.

I’m not opposed to testing button color, but this is a prime example of wasted testing.

Why? Because you should be testing the big things — macro conversions.

  • A micro conversion is something like a CTR — moving from one point in the funnel to the next.
  • A macro conversion is something like a purchase — the ultimate conversion goal of the site.

The goal of split testing is to improve conversions, but not all conversions are created equal. I advise marketers to test the big things, the important things, the revenue-generating things.

There’s a time and place for granular analysis of micro conversions and detailed insights, but test the most important things first.

Uri Bar-Joseph has a great word of advice on what to test: “While everything is testable, not everything should be tested. Always use common sense.“

How should you decide what to test? I have four suggestions. Each one is linked to in-depth resources.

3. You must only test one variable at a time.

The more eager the tester, the more likely he or she is to change more than one thing.

If you’re going to run a split test, then you must only test one variable at a time. For example…

  • Only change the headline, not the text color.
  • Only change the button size, not the button copy.
  • Only change the image dimensions, not the actual image itself.

Why is this so important? You want to know exactly what change produced what results.

Split testing is designed to isolate and test only one element at a time.

If you’re going to test multiple elements, then call it a multivariate test, not a split test.
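
One lightweight way to enforce this is to describe each version as a set of named elements and check how many differ before launching. The sketch below is only an illustration; the field names and copy are hypothetical, not taken from any real test.

```python
def changed_fields(control, variant):
    """Return the sorted names of every element that differs between two versions."""
    keys = control.keys() | variant.keys()
    return sorted(key for key in keys if control.get(key) != variant.get(key))


# Hypothetical page definitions: the variant changes the headline only.
control = {
    "headline": "Start your free trial",
    "button_text": "Sign up",
    "button_color": "green",
}
variant = dict(control, headline="Try it free for 30 days")

diff = changed_fields(control, variant)
if len(diff) == 1:
    print(f"Clean split test: only '{diff[0]}' changed")
else:
    print(f"More than one change ({diff}): treat this as a multivariate test")
```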

4. You must form a hypothesis.

A hypothesis is your assumption or prediction about how the test will turn out.

If you don’t form a hypothesis before you run a test, then you’ll be left scratching your head as to why it was important and what changes you should make.

Before you run your test, think to yourself, “I’m going to predict that if I change X, then it will have the impact of Y.” Call a winner, then run the test.

When you analyze the results, you can better determine whether the change is advisable, how important it is, and what results to expect from making it.

Unbounce suggests coming up with:

  1. A proposed solution
  2. The results that the solution will help you achieve.

The better you’re able to form a hypothesis, the more actionable your split test results will become.

5. You must run the test long enough to make it valid.

There’s no such thing as “quick” split testing. Data takes time to build up. You need to know when the test is complete before you end it or draw conclusions.
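
How long is “long enough”? One common way to estimate it is a sample size calculation done before the test starts. The sketch below uses a standard approximation for comparing two conversion rates; the function name, the 80% power setting, and the example rates and traffic are assumptions for illustration, and your testing tool may calculate this differently.

```python
import math
from statistics import NormalDist


def visitors_needed_per_variant(baseline_rate, expected_rate, confidence=0.95, power=0.80):
    """Approximate visitors per variant for a two-sided, two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_power = NormalDist().inv_cdf(power)
    pooled = (baseline_rate + expected_rate) / 2
    spread_null = math.sqrt(2 * pooled * (1 - pooled))
    spread_alt = math.sqrt(
        baseline_rate * (1 - baseline_rate) + expected_rate * (1 - expected_rate)
    )
    effect = abs(expected_rate - baseline_rate)
    return math.ceil(((z_alpha * spread_null + z_power * spread_alt) / effect) ** 2)


# Hypothetical goals: lift a 5% conversion rate to 6%, or to 10%.
small_lift = visitors_needed_per_variant(0.05, 0.06)
big_lift = visitors_needed_per_variant(0.05, 0.10)

# Assumed traffic of 1,000 visitors per day, split evenly across two variants.
daily_visitors = 1000
days_for_small_lift = math.ceil(2 * small_lift / daily_visitors)

print(f"5% -> 6%:  {small_lift} visitors per variant (~{days_for_small_lift} days)")
print(f"5% -> 10%: {big_lift} visitors per variant")
```

Small improvements need far more traffic to detect than large ones, which is why a test on a subtle change can take weeks to finish.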

The biggest mistake that people make is calling a test “statistically significant” when it’s not. Many testing tools will tell you when a test is “statistically significant.” Statistical significance is usually reported as a percentage confidence level.

So, for example, a 99.9% confidence level would be high: if the two versions truly performed the same, a difference this large would show up by chance only about 0.1 percent of the time.

The problem with statistical significance is that it can be misleading.

A test can be called “statistically significant” by testing software, but the software might be using a lenient significance threshold. In software like VWO, you can change the default threshold settings. If you don’t, you could be fooled into thinking that a test is statistically significant when in reality it’s not.
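
To see how much the threshold matters, here is a rough sketch that scores one made-up result against three thresholds. It uses a generic pooled two-proportion z-test and reports confidence as 1 minus the p-value, which is an assumption about how a calculator might do it, not a description of how VWO or any other tool works. The visitor and conversion counts are invented.

```python
import math
from statistics import NormalDist


def confidence_level(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-sided confidence (1 - p-value) that A and B convert at different rates,
    based on a pooled two-proportion z-test."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = abs(rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(z))
    return 1 - p_value


# Hypothetical result: A converts 200 of 5,000 visitors, B converts 240 of 5,000.
confidence = confidence_level(200, 5000, 240, 5000)

# The same data passes or fails depending on the threshold you pick.
verdicts = {threshold: confidence >= threshold for threshold in (0.90, 0.95, 0.99)}
for threshold, passed in verdicts.items():
    label = "significant" if passed else "not significant"
    print(f"Threshold {threshold:.0%}: {label} (confidence {confidence:.1%})")
```

The same numbers count as a “winner” at a 90% threshold and as inconclusive at 95% or 99%, so check which threshold your software is using before you trust its verdict.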

6. You must not follow others’ split test results.

Following someone else’s split test results is like wearing someone else’s glasses. It tries to correct a problem that you don’t have.

I like reading case studies just like everyone else. It’s fun to see which test won, and find out what optimizers are discovering on their sites.

But guess what. All this data is useless for you.

Why? Because it’s someone else’s test, on someone else’s site, in someone else’s industry, with someone else’s audience, testing someone else’s traffic, on someone else’s product or service. The test result has absolutely no application to you except to satisfy your curiosity.

The only test results you can act on are the results that you gain from running tests on your own site!

Popular testing websites and blogs feature articles with “shocking split test results.” Such articles get great click-throughs, but provide very little actual value.

When such articles explain the test results, they make it seem as if those results are universally applicable. So what do gullible marketers and CROs do? They apply the lesson of the test result to their own website.

This is a huge mistake! The split test case study was just that — a case study, not an unbreakable law that everyone should follow. You can test the same thing on your site, but don’t follow others’ test results without testing.

There’s no one color that always converts higher, or a magic word that produces more conversions. There is only the power of testing your own site.

7. You must act on your test results.

Don’t just test for testing’s sake. Test for the sake of higher conversion rates.

Run every test with a single ultimate goal — improving your conversions. Running tests takes time and resources. To get the highest ROI for your time and effort, make sure that you’re making changes as a result of what you learn.

Tests can only be deemed successful if you implement your results.

This may seem ultra-obvious, but there are plenty of organizations out there that have the resources to run tests, but have no resources available to make changes. Be sure that everyone is on the same page with testing:

  • Executives must understand that testing is the path to higher revenue generation. Once you have their buy-in, you can reach out to them if priorities start to shift away from testing. They will have the authority to keep testing on track.
  • Marketing must understand that testing is science. It’s not just another task to add to a hectic marketing schedule. You can’t rush testing. You have to be thorough, methodical and careful when analyzing your results. Otherwise you’ll hurt the business (or organization) more than help it.
  • Your designers and developers must be available to implement any changes that need to be made from winning tests. It’s very common that these two departments have more work than they can handle – so make sure they are slotted in to make changes.

Conclusion

I believe that split testing is a marketing activity with one of the highest ROIs. By testing, analyzing, and implementing, you can squeeze more conversions from your existing traffic. That’s powerful.

It pains me when I see optimizers squander this potential by committing these errors. Check your own split testing approach. See if you’re violating any laws, and revise your split testing accordingly.

I think you’ll see some major improvements.

What are some split testing mistakes that you see people make?

Neil Patel

Neil Patel is the co-founder of Crazy Egg and Hello Bar. He helps companies like Amazon, NBC, GM, HP and Viacom grow their revenue.

ONE COMMENT

  1. craig sullivan says:
    October 30, 2015 at 3:36 pm

    Neil,

    The point of my article isn’t to make up for A/A testing – by supplanting this with other methods, and you do my work a disservice, sir, by describing what I said in the way you did.

    I’m also perfectly normal – and my views are also quite commonly held. Suggesting that I’m somehow abnormal because I choose to check that my tests work on popular devices and browsers is illogical, unless you only wrote it to say something pithy, rather than something actually useful.

    “triangulated data analysis, instrumentation observations, and compatibility monitoring” – these seem to be your words and not mine. Just wanted to say that your precis of my article wasn’t quite hooking up there for me.

    It is perfectly valid to A/A test your site, test rig or how it works on one or more templates – what it’s kinda silly to do is repeat that for every single test. Or to waste time running A/A tests when you don’t know what you’re doing.

    Besides which, the central tenet of your argument – that doing an A/A test will protect you by spotting bias, is deeply flawed. Without a sufficient sample for the A/A test, you can EASILY see fluctuations that are immense. Almost all A/A testing will show a winner at a high confidence level during the test!

    If my A/B test is broken on iPhone 6 models, the A/A test won’t spot that. If my test is broken in some way that evens out or cancels any bias, the A/A test proves nothing there.

    I admire much of your work Neil but calling me on somehow doing something ‘abnormal’ and using such odd language, when your argument is less than robust, just undermines the other good messages in your article.

    Last time I spoke to you about A/B testing, we were in a taxi to a conference in Bucharest. You tried to tell me what A/B testing was all about during that journey and I had to explain a few things or help refine your interpretation.

    I remember thinking at the time, rather sadly, that I had expected you to hold your opinions less tightly – less dogmatically – especially when someone with more experience took time to patiently explain and help your understanding. The leaders in my work that I have looked up to have always been willing to learn from others around them and in the industry.

    Now I see the same thing from you, and I am not surprised.

    Craig.
