Bandit testing, also known as “multi-armed bandit testing”, takes its name from the “one-armed bandit”, a nickname for the casino slot machine, and from the behavior of players who often play several machines at once in order to optimize their payout.
Rather than staying with a single machine, the gambler will often spend some percentage of the time playing several other nearby machines. In this way, a new “hot” machine can be identified without leaving the original machine behind.
When used in website testing, bandit testing represents a paradigm shift from conventional A/B testing:
- In an A/B test, two options are tested head to head in equal percentages (50/50), and only the better-performing option continues to be presented to website visitors after the results are analyzed.
- In a bandit test, several options (often three or more) are tested simultaneously, and more traffic is diverted to whichever option is most successful at a given point in time, while the other options continue to run and be evaluated on a smaller percentage of the traffic.
In the example below, the first three weeks of the A/B test are dedicated to exploration, while the next three weeks are dedicated to exploitation of the findings.
Running the test as a “bandit” allows you to explore and exploit simultaneously, and shift traffic to maximize conversions while continuing to test other options.
For example, let’s say Option A in the bandit test was performing best during the first week of testing. You (or the algorithm) might then shift more traffic to Option A while continuing to gather data for B and C in the background, though at smaller percentages.
If you later begin to see a significant positive shift in conversion percentage for Option C, the bandit method has the flexibility to readjust your traffic accordingly. Option C then takes over the lion’s share of traffic, based on the recent trend, with the other options relegated to the background, yet still providing data.
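The reallocation described above can be sketched in Python. This is a minimal illustration of an epsilon-greedy-style rule, one of the simplest bandit strategies, and not a description of any particular testing tool. The function name `traffic_shares`, the `explore_share` parameter and the weekly figures are all hypothetical.

```python
def traffic_shares(stats, explore_share=0.2):
    """Return the share of traffic each option should receive.

    stats maps an option name to a (conversions, visitors) pair.
    explore_share is the total share reserved for the non-leading options
    (an assumed tuning parameter, not a recommended value).
    """
    if len(stats) < 2:
        raise ValueError("need at least two options to compare")

    # Observed conversion rate; an option with no visitors counts as 0.0.
    rates = {
        name: (conversions / visitors if visitors else 0.0)
        for name, (conversions, visitors) in stats.items()
    }
    best = max(rates, key=rates.get)

    # The current leader gets most of the traffic; the remaining
    # options split the exploration share evenly so they keep
    # producing data in the background.
    others = len(stats) - 1
    shares = {name: explore_share / others for name in stats}
    shares[best] = 1.0 - explore_share
    return shares


# Made-up cumulative results: (conversions, visitors) per option.
week1 = {"A": (60, 1000), "B": (45, 1000), "C": (40, 1000)}
week4 = {"A": (240, 4000), "B": (90, 2000), "C": (150, 2000)}

print(traffic_shares(week1))
print(traffic_shares(week4))
```

With these invented numbers, Option A has the highest observed rate after week 1 and receives 80% of traffic. By week 4, Option C has overtaken it and receives the 80%, while A and B each keep 10%.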
You might even decide to replace a poorly performing option with a new one (Option D) during week 6 (and perhaps beyond), as shown in the example, then continue to test for a longer period of time.
Numerous variations of bandit testing algorithms let you adjust what percentage of traffic is diverted to the highest-performing option, how often the mix of options changes, how many options are tested at once, and how much risk is taken with respect to confidence in the data (for example, by requiring a larger or smaller sample size).
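As one illustration of such a variation, the sketch below uses Thompson sampling, a widely used bandit strategy in which the traffic split depends on how much data has been collected as well as on the observed rates. It assumes a uniform Beta(1, 1) prior on each option's conversion rate. The function name `thompson_shares`, its parameters and the sample figures are hypothetical.

```python
import random


def thompson_shares(stats, draws=5000, seed=0):
    """Estimate traffic shares as the chance each option is the best.

    stats maps an option name to a (conversions, visitors) pair.
    Each option's conversion rate is modeled with a Beta posterior,
    starting from an assumed uniform Beta(1, 1) prior.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    wins = {name: 0 for name in stats}
    for _ in range(draws):
        # Draw one plausible conversion rate per option.
        sampled = {
            name: rng.betavariate(1 + conversions, 1 + visitors - conversions)
            for name, (conversions, visitors) in stats.items()
        }
        # Credit the option whose sampled rate is highest in this draw.
        wins[max(sampled, key=sampled.get)] += 1
    return {name: count / draws for name, count in wins.items()}


# Same observed rates (6%, 5%, 4%) at two very different sample sizes.
small_sample = {"A": (6, 100), "B": (5, 100), "C": (4, 100)}
large_sample = {"A": (600, 10000), "B": (500, 10000), "C": (400, 10000)}

print(thompson_shares(small_sample))
print(thompson_shares(large_sample))
```

With little data the shares stay spread across the options, because the leader could still be ahead by chance. With more data at the same observed rates, the share concentrates on the leading option. In this sketch the sample size itself controls how much risk the allocation takes.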
Bandit testing typically works best in situations that meet the following criteria:
- High volume of traffic
- Site or feature will be short-lived (such as a holiday sale)
- Testing can be automated