With ExactTarget releasing an A/B testing suite and Google Analytics providing the same functionality for websites, I thought it would be beneficial to go behind the scenes of what A/B testing really is. Like many real-world processes, A/B testing is grounded in mathematics, and more specifically in statistics. Don’t get scared away just yet, though; there is hope for all. Many applications simplify the testing process to the point that a marketer, designer, or developer simply selects whichever version is performing better. But what do we mean by performing better? And why would I want to understand it?
A Quick Analogy For The Faint of Heart
Let’s say we’re outside Grauman’s Chinese Theatre in LA. It is a nice day, with a lot of people out on the street checking out this historic site. However, we don’t have enough money to buy tickets to see a movie, so we end up standing across the street while waiting to catch a ride from a friend. Meanwhile, we are bored and begin to people watch. Then I pose this question: “What percentage of those people do you think are going to buy tickets?” One might immediately think to simply go to the ticket booth and ask how many tickets have been sold, but remember, we can’t leave our position across the street. I wouldn’t make it that simple. So what’s the solution? Although tedious, the answer is to count those who pass by the theater and those who leave through its exit. Then you could easily work out the percentage of people who bought tickets. But what does this have to do with my website or email?
The Explanation, Don’t Worry No Math Yet
A website or email acts very much like the theater in our analogy. Visitors to the theater behave much like visitors to a website. In both scenarios, the visitor is looking for something that is going to draw them in and lead them to make a decision. However, as our analogy shows, it is impossible to know ahead of time exactly how many people are going to make that decision. Therefore, the best you can do is estimate the conversion rate based on what you observe: at the theater, who passes by and who leaves through the exit; on your website or in your email, who visits and who clicks certain links.
Defining A Conversion
A conversion in its most basic form can be thought of as a success or failure. In mathematical terms, each visit is a trial with two possible outcomes: the visitor converts or doesn’t. The number of conversions out of a set of visits is what is called a binomial random variable. The conversion rate, the probability that any one visit converts, is represented by the variable p. The number of visits to your website, also known as trials, is represented by the variable n. After n visits we can calculate what fraction of them resulted in a conversion. Now each day (or any time interval) can be seen as a set of trials, or an experiment. Since we live in a world of random occurrences, we cannot expect to get the same value of p from every experiment. This randomness will give you a range for your conversion rate (we’ll need this later).
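To see that randomness in action, here is a small simulation sketch in Python. The "true" conversion rate of 20%, the 500 visits per day, and the function name simulate_experiment are all made-up assumptions for illustration; in real life you never know the true rate, which is the whole point.

```python
import random

def simulate_experiment(true_rate, visits, rng):
    """Simulate one experiment: each visit converts with probability true_rate."""
    conversions = sum(1 for _ in range(visits) if rng.random() < true_rate)
    return conversions / visits

rng = random.Random(42)   # fixed seed so the run is repeatable
TRUE_RATE = 0.20          # hypothetical true conversion rate (unknown in practice)
VISITS_PER_DAY = 500      # hypothetical number of trials n per experiment

# Run one experiment per day for a week and look at the spread.
daily_rates = [simulate_experiment(TRUE_RATE, VISITS_PER_DAY, rng) for _ in range(7)]

for day, rate in enumerate(daily_rates, start=1):
    print(f"Day {day}: observed conversion rate = {rate:.1%}")
print(f"Observed range: {min(daily_rates):.1%} to {max(daily_rates):.1%}")
```

Even though the underlying rate never changes, each day should report a slightly different value of p. That spread is the range we will put a number on shortly.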
Great, We’ve Run A Few Experiments But Now What?
As we increase the number of experiments our results will get more accurate, but let’s say we don’t want to do thousands of experiments. How can we determine the conversion rate? There is a concept, widely used in statistics, known as standard error, which tells us how much deviation to expect from the average conversion rate p if the experiment is repeated multiple times. The smaller the deviation, the more confident you can be in your results. Here is how we calculate the standard error:
Standard Error (SE) = Square root of ( p * (1 - p) / n )
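Where n is the total number of visits and p is an average of the conversion rates from the individual experiments (this assumes each experiment had roughly the same number of visits):
p = ( p1 + p2 + p3 + … + pk ) / k
Here k is the number of experiments, which is not the same thing as n.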
To get a 95% confidence interval, the range within which the true conversion rate is expected to fall, multiply the standard error by 2 (1.96, to be more exact). This brings us to:
p % ± 2 * SE
This means that, with 95% confidence, your true conversion rate lies between p minus two times the standard error and p plus two times the standard error.
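If you would like to see the arithmetic done for you, here is a minimal Python sketch of the standard error and the 95% range. The function names and the figures (200 conversions out of 1,000 visits) are made up for illustration, and the multiplier of 2 is the same rounded value used above.

```python
import math

def standard_error(p, n):
    """Standard error of a conversion rate p measured over n visits."""
    return math.sqrt(p * (1 - p) / n)

def confidence_interval(p, n, z=2.0):
    """Approximate 95% range: p plus or minus z standard errors (z rounded to 2)."""
    se = standard_error(p, n)
    return p - z * se, p + z * se

# Hypothetical figures for illustration only.
conversions = 200
visits = 1000

p = conversions / visits
se = standard_error(p, visits)
low, high = confidence_interval(p, visits)

print(f"Conversion rate: {p:.1%}")
print(f"Standard error:  {se:.2%}")
print(f"95% range:       {low:.1%} to {high:.1%}")
```

With these numbers the conversion rate comes out to 20% give or take about 2.5%. Try a larger number of visits with the same rate and watch the range shrink, which is why more traffic gives you a more trustworthy result.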
Making A Comparison
Beyond calculating the conversion rate of a website, we can also calculate the range for its variation in an A/B test. We have already established (with 95% confidence) that the true conversion rate lives within that range, so all we must do is look for overlap between the range for the website or email (the control) and the range for its variation. Or, as our movie theater analogy goes: one billboard placement versus another. If there is no overlap, the variation is better if its conversion rate is higher and worse if its conversion rate is lower. If the ranges do overlap, the test hasn’t shown a clear difference yet and you need more data. It’s just that simple. Here is an example:
Suppose the control has a conversion rate of 20.5% ± 1.5% and a variation has a conversion rate of 12% ± 1%. The control’s range runs from 19% to 22% and the variation’s from 11% to 13%, so there is no overlap and you can be reasonably confident in the result. Here, the control has the better conversion rate.
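The overlap check itself is easy to sketch in Python. The function names and the three result labels below are my own choices for illustration, not something your A/B testing tool will necessarily report; when the two ranges overlap, the sketch simply calls the test inconclusive.

```python
def to_range(rate, margin):
    """Turn 'rate plus or minus margin' into a (low, high) pair."""
    return rate - margin, rate + margin

def compare(control, variation):
    """Compare two (rate, margin) pairs by checking whether their ranges overlap."""
    c_low, c_high = to_range(*control)
    v_low, v_high = to_range(*variation)
    if v_low > c_high:
        return "variation wins"   # variation's whole range sits above the control's
    if c_low > v_high:
        return "control wins"     # control's whole range sits above the variation's
    return "inconclusive"         # ranges overlap, so no clear winner yet

# The example from the text: control 20.5% +/- 1.5%, variation 12% +/- 1%.
print(compare((0.205, 0.015), (0.12, 0.01)))
```

For the example above this prints "control wins", since the control’s lowest plausible rate (19%) is still higher than the variation’s highest (13%).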
What Have We Learned?
We’ve learned that a conversion rate is in reality an estimate, and that based on the conversion rate and standard error we can compare our control to our variations in a quantifiable manner. This comparison tells us which one performs better, which is really what we are after. If you couldn’t exactly follow everything, that is quite okay: these calculations become quite tedious by hand, and it is recommended that you use an A/B test calculator to save you the trouble. Hopefully this article has shed some light on A/B testing, and you’ll understand a little more of what is going on when you are using your A/B testing software.