I’m sure you’ve been in a situation where the results of a subject line test were very close. You may have decided to go with the subject line that had the slight edge in the test. Would that decision have yielded the best outcome?
Here’s a real-life example of a recent email campaign with very close open rates: the test was set up to deploy 2 days before the full launch, with 2 different subject lines, each sent to 4,000 email database members. The initial results were:
Subject Line A (SLA) = 14.1% open rate
Subject Line B (SLB) = 13.9% open rate
Using a significance testing calculator, we determined that the difference between the results was not significant. In fact, it was significant only at the 60% confidence level, meaning that roughly 40% of the time SLB could be expected to outperform SLA. This is a consequence of too small a sample size combined with a minimal difference in response rates.
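To sanity-check numbers like these yourself, you can run a two-proportion z-test, which is the calculation behind most online significance calculators (the specific calculator used here isn’t named, so this is a minimal sketch rather than the exact tool):

```python
import math

def one_sided_confidence(p1, n1, p2, n2):
    """One-sided confidence that the leading subject line truly wins,
    using a pooled two-proportion z-test."""
    x1, x2 = p1 * n1, p2 * n2              # estimated opens per test cell
    pooled = (x1 + x2) / (n1 + n2)         # pooled open rate under the null
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF via the error function: P(Z <= z)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

conf = one_sided_confidence(0.141, 4000, 0.139, 4000)
print(f"{conf:.0%}")  # roughly 60% -- SLA's edge is barely better than a coin flip
```

At 4,000 recipients per cell, a 0.2-point gap in open rates produces a z-score of only about 0.26, which is why the calculator reported significance at just the 60% level.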
It is at this point that the argument is usually made to proceed with the subject line that had the higher open rate, despite the fact that the difference between the two scores is not statistically significant. However, we chose to put the theory of statistical significance to the test and split the full list evenly to see how the results would really net out with a much larger sample.
The result of the full send was as follows:
SLA = 12.7% open rate
SLB = 13.0% open rate
While these results are only 0.3 percentage points apart, the sample size was greater than 180,000 for each subject line. In this case, the difference between the results was significant at the 99% confidence level.
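Rerunning the same pooled two-proportion z-test on the full-send numbers shows why the larger sample flips the conclusion (a sketch assuming exactly 180,000 recipients per cell):

```python
import math

def one_sided_confidence(p1, n1, p2, n2):
    """One-sided confidence that the leading subject line truly wins,
    using a pooled two-proportion z-test."""
    x1, x2 = p1 * n1, p2 * n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# In the full send SLB leads, so it goes in as the first proportion
conf = one_sided_confidence(0.130, 180_000, 0.127, 180_000)
print(f"{conf:.1%}")  # above 99% -- a 0.3-point gap is now clearly significant
```

The gap barely changed, but the standard error shrank with the square root of the sample size, pushing the z-score to about 2.7 and the confidence past 99%.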
So when referring to subject line testing with statistically insignificant results, “Close Enough for Government Work” is not close enough.