Slashdot highlighted a great post on an alternative to A/B testing. Those of you in the commerce-related world are probably very familiar with A/B testing, and I'm willing to bet that you have achieved decent results with it. It has one problem, though: it takes a lot of time to set up the test, review the results, and make your choice. Steve Hanov describes a very interesting alternative on his blog, the "epsilon-greedy" method. It's a medium-sized post, and Steve does a great job explaining the approach, why it works, and how it can save you time while achieving the same or better results than the traditional A/B approach.
Like many techniques in machine learning, the simplest strategy is hard to beat. More complicated techniques are worth considering, but they may eke out only a few hundredths of a percentage point of performance. One strategy that has been shown to perform well time after time in practical problems is the epsilon-greedy method. We always keep track of the number of pulls of the lever and the amount of rewards we have received from that lever. 10% of the time, we choose a lever at random. The other 90% of the time, we choose the lever that has the highest expectation of rewards.
    import random

    def choose():
        if random.random() < 0.1:
            # exploration!
            # choose a random lever 10% of the time.
        else:
            # exploitation!
            # for each lever,
            #     calculate the expectation of reward.
            #     This is the total reward given by that lever
            #     divided by the number of trials of the lever.
            # choose the lever with the greatest expectation of reward.
        # increment the number of times the chosen lever has been played.
        # store test data in redis, choice in session key, etc.

    def reward(choice, amount):
        # add the reward to the total for the given lever.

Go to his post for the entire explanation. It's worth it.
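
If you want something you can actually run, here is a minimal in-memory sketch of the outline above. It is my own illustration and not Steve's code: the `EpsilonGreedy` class, its method names, and the lever names in the usage note are hypothetical, and it keeps counts in plain dictionaries rather than redis or a session. It also assumes that an untried lever is scored optimistically at 1.0 so that every lever gets played at least once.

```python
import random


class EpsilonGreedy:
    """Hypothetical in-memory epsilon-greedy chooser (illustration only)."""

    def __init__(self, levers, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        # Allow a seeded random.Random to be passed in for repeatable runs.
        self.rng = rng if rng is not None else random.Random()
        self.trials = {lever: 0 for lever in levers}
        self.rewards = {lever: 0.0 for lever in levers}

    def expectation(self, lever):
        # Expected reward = total reward / number of trials.
        # Assumption: an untried lever is scored 1.0 so it gets tried.
        if self.trials[lever] == 0:
            return 1.0
        return self.rewards[lever] / self.trials[lever]

    def choose(self):
        levers = list(self.trials)
        if self.rng.random() < self.epsilon:
            # Exploration: pick any lever at random.
            lever = self.rng.choice(levers)
        else:
            # Exploitation: pick the lever with the best expectation so far.
            lever = max(levers, key=self.expectation)
        self.trials[lever] += 1
        return lever

    def reward(self, lever, amount=1.0):
        # Add the reward to the running total for the given lever.
        self.rewards[lever] += amount
```

In a web setting you might call `choose()` when rendering a page, say with levers named `"orange"`, `"green"`, and `"white"` for three button colors, remember the choice for that visitor, and call `reward(choice)` when the visitor clicks. Persisting the two dictionaries is left out on purpose.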