September 2, 2013

When Heuristics go Bad – Dealing with Common Optimization Practices – Part 2

My first trip through the common heuristics of conversion rate optimization looked at two of the more common testing ideas and how they usually reach false or limiting conclusions. In my second part I want to look at general testing theory best practices and how they can be major limiting factors in the success of your program.

It is important to remember that you are always going to get an outcome, so this is not about whether you can make money. How you and the people in your organization think about testing is the largest factor in the value that optimization produces. This is an evaluation of the efficiency of the method: how much it produces for the same or fewer resources. In concept you can spend an infinite amount of resources to achieve any end goal, but the reality is that we are always faced with a finite amount of time and population, which means we must always be looking for ways to improve inefficient systems. If we continue to be limited by these common heuristics, then the industry as a whole will continue to produce minimal results compared to what it can and should be producing.

Always have a Hypothesis –

There is no more misunderstood term than hypothesis. In all likelihood this is because most people are familiar only with their 6th grade (at least in my school) science instruction or with formal classroom science in college. In those fields we operate as if we have unlimited time and resources, and we are trying to validate whether a drug will cause cancer, not whether a banner will get more clicks if it is blue or red. The stakes are higher and the models are much simpler in controlled classroom studies of cancer. There is a lot to the scientific method, especially when approached from a resource efficiency perspective, that is not considered in such a simplistic view of idea validation.

We must apply scientific rigor, but we must also make sure that all actions make sense in real world situations, which means that efficiency and minimizing regret are more important than validation of an individual's opinion. The problem is not that the scientific method relies on the use of a hypothesis; it is that we mistake a hypothesis for a correct hypothesis. We seek validation for our opinions and not the discovery of the best way to proceed. Science is also about proving one idea against all other alternative hypotheses, yet we ignore that part of the discipline because it is not the part that allows someone to see if they are right. In the grand scheme of things we are drastically overvaluing test ideas, and that is distracting from the parts of the process that provide value.

Let’s start with the basics. You should never, and I mean never, run a test if you do not have a single success metric for your entire site. In most cases this is to make more money, but whatever it is, this goal exists outside of the concept of the test. You must also have rigid measurement and action rules that are reproducible, which means that you must understand real world constraints like the limitations of confidence and variance.

You can then have an opinion about what you think will happen when you make a change. The problem comes when we confuse that opinion with the measured goals of the test. Even worse, we limit what we compare, resulting in a massively inefficient use of our time and effort. You may believe that improving your navigation will get people to spend more time on your site, but that belief is completely irrelevant to the end goal of making more money. Your belief that more engagement will result in more revenue is not enough to make it so. If you are right AND that also produces more revenue, then you will know that from revenue. If you are wrong, you will only know that from revenue. We must construct our actions to produce answers both to our opinion and to what is best for our organization. Hypotheses and ideas are just a very small part of a much more complex and important picture, and over-focusing on them allows people to avoid the responsibility and the benefit of focusing on all those other parts, which are the ones that really make a difference over time for any and all testing programs.

The worst part of this is that it allows people to fall for congruence bias and to fail to ask the right questions. We become so used to the conversation around a single idea that the concept of discovery and challenging assumptions is more word than action. Questions can be incredibly important to the success of a program, but only if they are tackled in the right order and used to focus attention, not as the final validation of spent attention. If your hypothesis is that a certain navigation change will result in more engagement, then the correct use of your resources is to test either which of a number of different versions of the navigation will produce the most revenue or, if you can, which section on your site produces the most engagement when changed. In both cases you have adapted your “hypothesis” to present a more efficient and functional use of your time. The hypothesis exists, but it is not the constraint of the test. If you are right, you will see it. If you are wrong, you will make more money.
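To make the navigation example concrete, here is a minimal sketch of judging several variants by the site-wide success metric rather than by the metric the hypothesis was about. The variant names and every number are invented purely for illustration; they do not come from a real test, and a real comparison would also need the confidence and variance rules discussed earlier.

```python
# Hypothetical results for three navigation variants. All values are
# made up for illustration; none come from an actual experiment.
variants = {
    "control": {"minutes_on_site": 3.1, "revenue_per_visitor": 1.92},
    "nav_a": {"minutes_on_site": 4.0, "revenue_per_visitor": 1.85},
    "nav_b": {"minutes_on_site": 2.8, "revenue_per_visitor": 2.10},
}


def best_by(metric):
    """Return the name of the variant with the highest value for `metric`."""
    return max(variants, key=lambda name: variants[name][metric])


# The hypothesis was about engagement, so it would crown this variant.
hypothesis_pick = best_by("minutes_on_site")

# The site-wide success metric is revenue, so this is the variant to act on.
revenue_pick = best_by("revenue_per_visitor")

print("Most engagement:", hypothesis_pick)
print("Most revenue:", revenue_pick)
```

In this made-up data the variant that "proves" the engagement hypothesis is not the one that earns the most, which is only visible because more than one alternative was measured against revenue.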

This means that having a hypothesis is important, but only if it is not the test charter. Have an idea of what you are trying to accomplish, but know that seeing the value of different actions compared to each other is more important. Sometimes the most effective hypothesis is “I believe that we do not know the value of different sections on our pages.” Don’t confuse your opinion on what will win with a successful test. Challenge assumptions and design efforts to maximize what you can do with what you have, and you will never be without opinions. The best answers always come when you are proven wrong, but if you get too caught up in validating your hypothesis, then you will always be missing the largest lessons you could be learning.

We need to optimize X because it is losing Y

This is the classic problem of confusing rate and value, or more correctly, correlative and causal inference. We confuse what we want to happen with what is really happening. Just because people were doing X and now they are doing Y, it doesn’t mean that this is directly causing any change, positive or negative, to our end goals. Beyond the three rules of proving causation, the real issue here is that we get tied to our beliefs about a pattern of events even when the data cannot possibly validate that conclusion. Understanding and acting on what you know, as opposed to what you want to have happen, is the difference between being data driven and simply being data justified.

Think about it this way: I have 23% of clicks on one section of my page and 0% on another. If I were to improve one of those, which one is going to produce the biggest returns? The answer is that you do not know. A rate of interaction cannot possibly tell you the value of changing that item. Some of the most important parts of any user experience are things that can’t even be clicked.

This plays out outside of clicks too. We have a product funnel and we see more people leaving on page 3, therefore we need to test on page 3. The reality is that more or fewer people may or may not be tied to more or less revenue. Even if it is tied, it may be a qualification issue higher in the funnel, or a user interaction issue, or simply too many people in a prior step. This is called a linear assumption fallacy, where we assume that when we have 5 people and 2 convert, then if we have 10 people 4 will convert. Linear models are rare in nature but are easy to understand, so we fall back on comfort over realistic understanding.
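The 5-and-2 versus 10-and-4 example can be sketched in a few lines. The saturating curve below is only one assumed alternative to the linear model, and its `ceiling` and `half_point` parameters are hypothetical values chosen so that both models agree at 5 visitors; it is not a claim about how any real funnel behaves.

```python
def linear_projection(visitors, base_visitors=5, base_conversions=2):
    """Assume conversions scale in direct proportion to visitors."""
    return visitors * base_conversions / base_visitors


def saturating_projection(visitors, ceiling=6.0, half_point=10.0):
    """Assumed alternative: each extra visitor adds less than the one before.

    `ceiling` is the most conversions this hypothetical funnel can reach and
    `half_point` is the visitor count at which it reaches half of that.
    """
    return ceiling * visitors / (visitors + half_point)


for n in (5, 10, 20):
    print(n, linear_projection(n), saturating_projection(n))
```

Both functions return 2 conversions at 5 visitors, but at 10 visitors the linear model expects 4 while the assumed saturating model yields 3. The two only diverge once you move away from the point you actually measured, which is exactly where the linear assumption gets used.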

The act of figuring out what to test can be difficult, but it is never improved by pretending we have validation of our own ideas when we have nothing to justify them. We need to be open to discovering where we should go and not to focus on some set path. In almost all cases you will find that you are wrong, often dramatically so, about where problems really are and how to fix them. This is why it is so important not to focus solely on more or less correlative actions. We can and should be able to test fast enough and with few enough resources that we will never be limited to this realm unless we are stuck there mentally.

Like so much else, what you spend your time and effort on is incredibly important. There are a thousand things you can improve and there are always new ideas. Justifying them falsely or focusing on them instead of the discipline of testing is nothing but a drag on your entire testing program. Test ideation is about 1% of the value derived from a test program, yet it is 90%+ of where people like to spend their time. A 5% gain that took 2 months is worth a lot less than a 10% gain that took 2 weeks. The most important issues we must face are not about generating test ideas or validating our beliefs about how to improve our site; they are about discovering and applying resources to make sure that we are doing the 10% option and not the 5% option. If we overly focus on test ideas and not the discipline of applying them correctly, we are never going to achieve what should be achieved. If we get lost trying to focus only on where we want to go, then we will always be limited in the possible outcomes we can generate.
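The 5%-in-2-months versus 10%-in-2-weeks comparison can be worked through as a rough sketch. It treats 2 months as 8 weeks, and the compounding function assumes, unrealistically, that every test in a year repeats the same gain; both are simplifying assumptions used only to show the scale of the difference.

```python
def gain_per_week(lift, weeks):
    """Average lift earned per week spent on the test."""
    return lift / weeks


def compounded_lift(lift, weeks_per_test, horizon_weeks=52):
    """Total lift over the horizon if every test repeated the same gain.

    This is a strong assumption; real tests vary widely and many lose.
    """
    cycles = horizon_weeks // weeks_per_test
    return (1 + lift) ** cycles - 1


slow = gain_per_week(0.05, 8)   # 5% gain over roughly 2 months
fast = gain_per_week(0.10, 2)   # 10% gain over 2 weeks

print(f"Slow test: {slow:.3%} per week")
print(f"Fast test: {fast:.3%} per week")
print(f"Fast test earns about {fast / slow:.0f}x more per week")
print(f"Year of slow tests: {compounded_lift(0.05, 8):.0%}")
print(f"Year of fast tests: {compounded_lift(0.10, 2):.0%}")
```

Per week of effort the faster test returns about eight times as much, and because more test cycles fit into the same year the gap widens further once gains build on each other.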
