Why we do what we do: Separating Fact from Fiction – The Narrative Fallacy

Stories are powerful devices that help get a point across to others. They help us close the distance between abstract thought and the way our brains operate through narrative. They help us add order to events. We can convey very complex ideas and help others understand them with our stories. Even more powerfully, this is how we are wired to understand events and information. But what happens if the story we tell is not the right one? How would you even know? Stories are often far more powerful at bypassing rational decisions than at facilitating them. The human mind is wired to support any conclusion it comes to, even in the face of mounting evidence against our supposition. Nassim Taleb describes this error in logic with what he calls the narrative fallacy: the need for people to create stories, even when we have no evidence that the story is true or even the best explanation of events.

Taleb’s description is as follows:

The narrative fallacy addresses our limited ability to look at sequences of facts without weaving an explanation into them, or, equivalently, forcing a logical link, an arrow of relationship upon them. Explanations bind facts together. They make them all the more easily remembered; they help them make more sense. Where this propensity can go wrong is when it increases our impression of understanding.

Here is an all too common example of how this plays out in the real world. You run a test and discover that a variant produces a 6% lift in RPV. You also discover that the same variant produces a 5% drop in internal searches. So you tell others that people were obviously finding what they wanted more easily, so they didn’t need search, and so they spent more. The data does not tell you that; it only tells you that the recipe produced a lift in RPV and a drop in searches. It doesn’t tell you that search and RPV are related, or why someone spent more; it only presented a single data point for comparative analysis between the default and that recipe. Any story you come up with adds nothing to why you should make the decision (it raised RPV), and it can set a dangerous precedent for believing that dropping search always raises RPV (it might, but a single data point in no way provides any insight into that relationship).

Any set of data can be used to make a story. There doesn’t have to be a connection between the real world and the story we tell, since we are the ones filling in the gaps between data points. Randomness and direct cause are easily confused when we start to narrate an action. We love these stories because they make events accessible and easy for someone to hear: first event A happened, then event B, then event C. What happens is that our minds instantly race to say event B happened BECAUSE event A happened, and because event A led to event B, naturally event C happened. This might be true, it might not be, but the story we tell ourselves grants us an excuse not to understand what really went on. We eliminate the discovery of the real relationship by granting ourselves a story that fills in those gaps, despite its lack of connection to the real world. This can completely ignore hundreds of other causes and also rules out the involvement of chance.

The world is a very complex place, and there is almost never an answer as simple as a short series of events to explain any action, let alone one important enough to base a business decision on. So why do we let ourselves fall into this, and why do we fall back on stories as a tool to make those decisions? We are not actually adding any real value to the information with these stories; we are simply packaging it in a way that helps push an agenda.

We don’t actually need stories to make decisions; we only need discipline. Oftentimes we find ourselves stuck trying to convey concepts beyond others’ ability to absorb in a short period and amid many competing draws on their attention, but this is not an excuse for believing the fiction that we narrate. Many people base their entire jobs on their ability to tell these stories, not on their ability to deliver meaningful information or change. The reality is that to make a decision, you simply need the ability to compare numbers and choose the best one. I don’t need to know why variant C was better than B; I simply need to know that it was 5% better. Patterns and anomalies are powerful tools and the analyst’s best friend, but we can never confuse them with explanations for those events. Oftentimes things happen for very complex and difficult reasons, and while it is nice to feel that we understand them, that feeling does not change the pattern of events.

One of the main opportunities for groups to grow is to move past this dangerous habit of creating stories and instead focus on creating disciplined, previously agreed-upon rules of action, so that decisions can be made away from the narrative. This move allows you to stop wasting energy on that discussion and instead use it to think of better, more creative opportunities to explore and measure the value of. Any system is only as good as the input into it, so start focusing on improving the input and stop worrying about creating stories for every input into that system.

Because of how difficult this change is for some groups, many are turning to more advanced techniques in the hope of avoiding this bias. The fundamental goal of machine learning is to remove human interpretation of results and instead let an algorithm find the most efficient option. All of these systems fail when we lose focus and get back into storytelling, when we let the ego of others dictate an action based on how well they think they understand the reality of the situation. They also fail when we allow our own biases to make the decisions over the system, instead of letting the system learn and choose the best option. When we free ourselves from storytelling, it gives us the freedom to focus on the other end of that system. We don’t need to worry about acting on the data, or about others understanding it; we can instead focus our and others’ energy on trying new things and on feeding the system with more quality input.

Love your stories, and if you need them to get a point across, do not instantly remove them from your arsenal. Just don’t believe that they are conveying anything resembling the cause and effect of the world, and do not let them be the deciding factor in how you view and act on the world. They are color, and they make others feel good, but they add no value to the decisions being made. Be clear with others on how you are to act before you ever get to the storytelling and you will discover that stories are simply color. Every journey is a story, just make sure yours is less fiction and more about making correct decisions.

5 Things You Can Do Today to Make Your Optimization Program Better

For those who read my blog on a regular basis, you will notice that I rarely focus on lists of steps and instead focus on the disciplines and key concepts that dictate making the right choice. The reason for this is that it is easy to get caught up in to-do lists, but knowing what makes a good decision and what makes a bad decision is what will make you successful long term. That being said, it is important from time to time to have a practical list of actions you can take that will dramatically improve your program and enable you to get past a lot of the misconceptions and focus on what matters.

Here are 5 actions that you can start doing today that will dramatically shape your program for the better.

1) Choose a single success metric.

This is either the hardest or the easiest step for most programs, but it is the single most important thing you can do. Optimizing the wrong metric, while it might make you feel like you accomplished something, does nothing but waste time and effort. If you are optimizing for clicks, bounce rate, dependent metrics (people who clicked on this banner), getting people farther into your site, cart entries, or any of a hundred other misaligned metrics, you are getting little to no value from your program. Never confuse your KPIs from analytics with your testing objective. You cannot assume that just because you get people farther or get more of some action, it magically equals more value for your organization. Stopping all actions until you get this one aligned is fundamental and required for you to provide any value to your organization.

If you are not sure what that metric is, the key is to translate actions as close to revenue as possible. If you can’t decide, translate every action into a monetary amount, and then only make decisions on RPV. If you are a lead site with equal value per lead, that is the only time conversion rate is acceptable. If you are a media site, score every page (scoring 99% of the site is the same as scoring 0%) and then use score per visitor. No matter what you choose, the key is that you look only at that metric to make decisions; never get caught reporting on the separate components that produce that value. Never confuse the “goal” of a test with value for the site; only make decisions that help the entire organization, not just your group or your responsibility.
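If it helps to see that translation concretely, here is a minimal sketch. The event names and dollar amounts are hypothetical placeholders, not a standard taxonomy; the only point is that every action collapses into one monetary number per visitor, and that single number is the only thing a decision is made on.

```python
from collections import defaultdict

# Hypothetical monetary values for each action a visitor can take; these numbers
# are placeholders for whatever your organization agrees an action is worth.
ACTION_VALUES = {
    "purchase_revenue": 1.0,    # multiply by the actual order revenue
    "lead_submitted": 25.0,     # only valid if every lead is worth the same
    "article_page_view": 0.05,  # media-site page score; score every page
}

def value_per_visitor(visits):
    """visits: list of dicts like {"variant": "B", "actions": [("lead_submitted", 1)]}.
    Returns {variant: monetary value per visitor} - the only number decisions use."""
    totals = defaultdict(float)
    visitors = defaultdict(int)
    for visit in visits:
        visitors[visit["variant"]] += 1
        for action, amount in visit["actions"]:
            totals[visit["variant"]] += ACTION_VALUES.get(action, 0.0) * amount
    return {v: totals[v] / visitors[v] for v in visitors}
```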

Way too many programs fail or produce fake results because they fail to tackle this problem. That 86% increase in clicks sure sounds nice, but it may be tied to a LOSS in revenue, not a gain, and unless you are trying to lose money, don’t get caught in that trap. A little bit of pain today will save you years of pain during the life of your program. You can get an outcome from your tests without doing this, and many programs will happily tell you how great their improvements were, but they have zero idea whether those actions actually provided any additional value to the organization as a whole. Just tackling this problem and getting people talking about and aligning on one goal will stop you from wasting time and resources and stop others from abusing the program for their political agendas.

If you do nothing else, do this one step.

2) Stop answering every question

The entire point of aligning on a metric and testing is speed of execution and ensuring that you aren’t acting on a false positive. The longer it takes you to act on something, or the less rational the decision, the lower the value you get. People are always going to want more information; they are always going to want to further their agenda. It is not the testing program’s job to answer every question; it is its job to find the right answer and make sure it is acted on. Never forget that the hardest but most important action you can take is saying no, not yes. Yes makes people happy; no makes people successful.

Agree on rules of action today, and then anytime someone goes off course (and they will), hold everyone accountable and do not give in to their endless questioning. Make sure those rules of action account for all the realities of testing in the real world, not just classroom statistics. If you need to, allow for 3-4 additional metrics, but NEVER make a decision off of those metrics; only use them for future research. Remind people that a single data point neither tells you correlation nor answers why. So much time and effort is wasted on people not understanding what testing can and can’t do. You can accomplish so much more and achieve far better results if you just focus on what matters and don’t give in to fear or politics.
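As a rough illustration of what a pre-agreed rule of action can look like, here is a sketch in code. The metric names and the control label are hypothetical; the point is simply that the decision reads only the single success metric, while the extra metrics are recorded for future research and can never override the outcome.

```python
def decide(results, control="A", primary="rpv"):
    """results: {variant: {"rpv": 4.23, "bounce_rate": 0.41, ...}} per variant.
    The decision reads only the primary metric; everything else is returned as
    research notes and can never change the winner."""
    winner = max(results, key=lambda v: results[v][primary])
    research_notes = {v: {k: m for k, m in metrics.items() if k != primary}
                      for v, metrics in results.items()}
    if results[winner][primary] <= results[control][primary]:
        winner = control  # nothing beat the control on the agreed metric
    return winner, research_notes

# Example: the winner is chosen on RPV alone, even though B has the worse bounce rate.
winner, notes = decide({"A": {"rpv": 4.10, "bounce_rate": 0.35},
                        "B": {"rpv": 4.31, "bounce_rate": 0.41}})
print(winner, notes)
```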

3) Educate

There are hundreds of uses of data, almost all of which are futile or pointless. The human mind is wired to use data as selfishly and poorly as possible. Do not let this be the end of the story. Stop reacting to requests from others and change your focus to proactive education. Do not get sucked into the context of the current action; instead focus on what people need to know and on what poor actions you need to stop.

My suggestion is to start by teaching people how and why you make a decision (single success metric and rules of action). Follow with what the standard information is and how best to use that data (hint: validating things after the fact is not a proper use of information). Other key topics should be correlative versus causal data, cognitive mistakes with data, statistics, and predictive tools. You need to make sure these classes cover not only what these tools can do, but also, just as important, how people can abuse or misunderstand the value of these tools. Don’t confuse teaching people how to get data with teaching them how to use data. There are no silver bullets, and tools are only as good as how they are leveraged.

Technology is a lens that magnifies either your strengths or your weaknesses, so make sure you are helping others mitigate their weaknesses so they can focus on their strengths. Make it a major focus to be on the offensive with data and to make it easier for people to do the right things. It may seem like more work, but you will find that doing this dramatically reduces pointless requests and increases the time you have available to do what matters.

4) Read

You are not going to start out knowing everything, nor is not knowing an acceptable excuse. There is always more to learn, and there are always new and better ways to think about and tackle problems. Make educating yourself a major priority, and make it something you do every day. How can you expect others to understand things better and use information more rationally if you are not far ahead of them on that quest? Do not focus on just industry blogs; most of them are designed to be cotton candy for the mind, there to be fluff and make you feel better today, but mostly empty and of no value long term. I would actually suggest that if your time is short, you read anything but what you would normally look at. Reading something you already agree with accomplishes nothing but making you feel better. The goal is to use new outlooks as a way to increase your ability to deal with problems. Look beyond your normal focus and find out how others use similar tools for success and failure. There are amazingly gifted people out there, but it takes work to find them and to sift through all the fluff that dominates most writing.

The start of my suggested reading list would be:

Moneyball – Don’t just read this and think, “cool, they used data”; understand the tools they used, and understand what they did and DIDN’T do. Understand market efficiencies and closed versus open systems for data analysis.

Black Swan – Any book by Nassim Taleb will do, but there is no greater thinker on, or explainer of, the damage that abusing data and not understanding math can do to the world. This is a bit hard for some to pick up, but I cannot stress enough how important this information is to doing the right thing. To me this is the single most important book on modern data usage and how to design an organization to use data correctly.

You Are Not So Smart – This is my favorite cognitive psychology blog, but any of them will do. Another I might suggest is Less Wrong. Understanding how people abuse data in general, why they believe what they do, and how they will react to data that does not confirm their world view is vital, and you will find the same problems present in the people you work with every day.

MIT Sloan Sports Analytics Conference – As good an entry point to sports and statistics as Moneyball is, it has been surpassed many times over. You can get a good introduction to advanced uses of statistics by looking at the work presented every year at the Sloan Sports conference.

Don’t stop there. Find other areas, and make sure that you read both what you agree with and what you don’t agree with. Remember that just putting something out there does not make it correct, and that oftentimes what sounds closest to what you want to hear is designed to do just that, not to make you better.

5) Stop Caring about What Wins

There is nothing less important than what won in a test. The winning variant is something you don’t control, but building the right system to discover things is something you do control. Never let yourself get caught up in which variant won; focus instead on making sure you are testing correctly. Do you have the chance to prove yourself and others wrong? Do you understand the math of testing beyond a high school level? Do you know how to act on data? Are you challenging assumptions and focusing on learning over proving people right?

If you do those things, then outcomes will always come. If you don’t, then outcomes are random and never as valuable as they should be. Just getting a result from a recipe tells you nothing; it is the ability to build context and act on data that matters. Don’t just assume your tool or the goodwill of others will magically solve this problem. It is ok if others focus on what wins in a test, as long as that is not the end of the conversation or all that is allowed to happen.

So there you have it, 5 steps that you can start doing this moment to make your program stronger. They aren’t easy, but nothing worth doing ever is. It might seem impossible to add these onto your current workload, but the reality is that doing these frees you up from your current workload and makes each action more valuable. Deal with the real problems, tackle the things no one else wants to, and become better at what you do and you will have the ability to make others better as well.

7 Deadly Sins of Testing – Confusing Rate and Value

One of the most difficult principles for many people in our industry to understand is that rate and value are not the same thing. One of the fastest ways for a program to go astray is to confuse one for the other. It is easy for people to understand the need to agree on what you are trying to accomplish, or why they need to have leadership. It is even easy to talk about the need for efficiency and about why it is ok to be wrong, yet even when people get past that point, they still consistently miss this critical difference. We so desperately want to explain our value to the company that we confuse the outcomes around our actions with the value of those actions in an attempt to justify them. This fundamental misunderstanding leads to a wide range of poor decisions that dramatically limit the positive impact a testing or analytics program can have.

A rate is simply a ratio or a description of an actual outcome; it is the same thing as me telling you that I have a $4.23 RPV for a population or that I got 5000 conversions. This is a description of past behavior; it is simply an outcome, not a description of why or how that outcome came to be. Where people lose focus is that the value, or ability to positively or negatively influence that outcome, is not tied to those gross numbers. A description of rate tells you nothing about an individual action, since you are not comparing that outcome, only describing it. Increasing your conversions does not inherently create more revenue, nor does the revenue by itself reflect positive value generated by an action. We measure things by saying we ran a campaign and then we got $3.56, but that is not the same as telling you anything about the value of that campaign. Value would be the difference between running that particular campaign and not doing anything, or running a different campaign. The rate is the end outcome; the value of the action is how much it improved or decreased performance.

People are so conditioned to express their contribution or to explain their value as the outcome of a group. I am responsible for the product page, or SEO, or internal campaigns, so therefore I must be the sole reason for the generation of that value. Just because your department or your product produces 10 million dollars, it does not mean that is representative of your value. Value is simply the difference against what would have happened if you had done nothing, or if you had chosen a different route. Value is what we are really talking about in optimization: we are discovering the various tools and options that allow us to influence our current state and improve performance. We have a way to measure the efficiency of different actions and to choose actions based on a rational process instead of opinion and “experience”. The value of an action is the amount it increases or decreases the bottom line, which means your value is the ability to choose the best influencers and avoid the worst ones. Stop defining actions by their rate and instead think in terms of value, and you will completely change your view of the world. The question is never what did it do, but what would have happened if it simply ceased to exist, or if we had chosen any of the other routes available to us. Which one would have generated the highest possible outcome?
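To make the distinction concrete, here is a toy illustration with made-up numbers (the holdout figure in particular is purely hypothetical). The rate is the outcome you observe; the value is the difference against the alternative of doing nothing.

```python
# Rate: the observed outcome. Value: the difference against the alternative.
campaign_rpv = 3.56   # rate: RPV with the campaign running
holdout_rpv = 3.61    # rate: RPV for a comparable holdout with no campaign (hypothetical)

incremental_value = campaign_rpv - holdout_rpv   # -0.05: the campaign actually lost money
print(f"Rate: ${campaign_rpv:.2f} RPV | Value: ${incremental_value:+.2f} per visitor vs. doing nothing")
```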

So where does this cause people to go astray? The first place is in assuming that past behavior reflects value, instead of the rate of an action. People are used to doing complicated analysis that shows that people who click on section Y are worth $3.45. What they are missing is that this expresses the rate of revenue from people who click on that section, not the value of the section. Using correlative information, it is impossible to know what that section is really influencing: is it positive or negative? Would you make more or less without it? What about other alternatives? It is a fundamental shift in how you view the world: not focusing on what was, but only on the influence and cost of changes. Getting caught on this definition often leads to misallocation of resources and to groups holding items sacred that are actually negative to end performance.

This change in viewing the world also requires that every person accepts that their inputs, skills, and responsibilities are part of this system, and that there is rarely going to be a perfect match between what is best for their group and what is best for the organization as a whole. You are not defined by the rate output of your responsibility, but by what you do with it. What matters is the ability to view everything as working together to improve the whole, which necessitates not focusing on individual groups, items, or interactions. When we are trying to generate value for the organization and improve our bottom line, the least important item is what an action does to item X or to section Y. That information is rarely meaningful, as it is a single data point, and it will always create a disconnect with what is best for the site as a whole. Getting people to act rationally is not inherent to the human condition, but it is vital to getting the best results.

The easiest way to prove this simple dissociation between rate and value with testing is to run an inclusion/exclusion test and simply remove each item one at a time. If you know the rates beforehand, or if you believe that an item is worth some value, then it would follow that you should drop that entire value when you remove the item. In reality, you will find little connection between that correlative value and the outcomes, and you will be shocked by how often things you thought were valuable turn out to be negative to total page performance.
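As a sketch of what that test plan can look like, here is one way to enumerate the variants, assuming a hypothetical page made of discrete modules (the module names are placeholders). Each variant removes exactly one module; comparing each variant’s RPV against the full page is what measures that module’s value.

```python
# Hypothetical page modules; each exclusion variant removes exactly one of them.
PAGE_MODULES = ["hero_banner", "recommendations", "internal_search", "reviews"]

def exclusion_variants(modules):
    """Yield (variant_name, remaining_modules): the full page plus one variant per removed item.
    Comparing each variant's RPV against the full page measures that item's value."""
    yield "control_full_page", list(modules)
    for module in modules:
        yield f"exclude_{module}", [m for m in modules if m != module]

for name, layout in exclusion_variants(PAGE_MODULES):
    print(name, layout)  # feed each layout into the testing tool as its own variant
```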

Testing is an amazing tool that gives you the ability to see the value of items. It is not very useful for the rate of outcomes, since we are comparing outcomes, but it gives you so much more insight than what you had before. It frees you up to see the world differently and to tackle real problems that you could never tackle before. Understanding what information is telling you, what it isn’t, and how best to leverage different types of information together is what changes myth to reality in your use of data. In order to start down that path, you must first deeply understand the difference between rate and value, and understand that your job is not to focus on rate, but instead to discover value.

7 Deadly Sins of Testing – Testing Only What You Want

The many layers of what can run a testing program off track are both complicated and simple. All the major errors come from a need to fit an existing structure, to do what your boss wants, and most importantly to do what will make others happy. All the hard work that really defines success consists of the things that no one wants to do, be it agreeing on a metric, being the leader that your organization needs (even when it may not want it), or making sure people understand efficiency. All of these sins are really about defining how you are going to act when it comes time to act.

The next sin, testing only what you want, is the first one that is really about the action itself. Avoiding it is not just about sitting together and “brainstorming” or listening to a pitch about something that sounds great; it is about incorporating the need to grow and learn into your actions, so that the path your group takes is organic and not inorganic. Groups get so caught up in testing only what the boss wants, or what their design people think will win, that they miss almost all the really important successes that can come from a test. Every group starts out wanting to test one feature or another, and everyone hears about a best practice or a cool thing that another group did and wants to do the same. We fail to feed the system with a broad range of inputs because we can’t see past our own opinion, and because of that we dramatically lower the outcomes of our efforts.

Groups fail when they are too caught up in what wins, or in proving someone right. People are so caught up in validating an idea that they fail to see what other options would do, or even more importantly, what happens if they are wrong. What matters is not the discovery of a single validated idea, but the comparative analysis of multiple paths. This means that the worst thing we can do is limit our efforts to what we want or to what is popular. The sin in testing is the desire to validate instead of learn, and the limiting of focus to our own opinion. The truth is that the least important part of any test is what wins, since the value of the “win” is only as good as the context of that win. If we discover a 5% lift, that may be great, but if in the same test we could have had a 10%, 20%, or 50% lift, then the 5% suddenly becomes an awful result. We get the most value when we are wrong, and when we discover this and allow ourselves to move down that path. Being aware of your ego, and not limiting your efforts to your own or your boss’s opinion, is what defines the magnitude of the value you actually achieve.

The sad truth is that we are often extremely unaware of the real value of our opinions. There is almost an inverse correlation between what people think will win and what actually wins. One of the best ways to test this is to make sure that each test has a large number of very different variants and then run a poll before the test on what people think will win. You will find almost no connection between votes and outcomes, which says a lot about our ability to measure things rationally after we have formed an opinion. This means that any time we only test what people think will win, or what they want to see win, we have fundamentally crippled our ability to deliver meaningful value. Remember that if you only test two things and the thing you wanted wins, all you have done is add cost with the test. Challenge yourself and others to think in terms of possibilities and not opinions, and to scope things in terms of opening up the most options, not just the preset options.
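If you want to run that pre-test poll exercise, a minimal sketch is below. The vote counts and lifts are fabricated purely for illustration; the point is to line up the poll against the measured outcomes and see how weak the relationship usually is.

```python
from scipy.stats import spearmanr

votes = {"A": 12, "B": 3, "C": 7, "D": 1}                      # pre-test "which will win?" poll
measured_lift = {"A": -0.02, "B": 0.06, "C": 0.01, "D": 0.04}  # observed lift vs. control

variants = sorted(votes)
rho, p_value = spearmanr([votes[v] for v in variants],
                         [measured_lift[v] for v in variants])
print(f"Rank correlation between votes and outcomes: {rho:.2f} (p={p_value:.2f})")
```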

One of the hardest tasks for groups is the need to assume that they know nothing about the value of an action. You are invested in proving to others that you know best, or in the value of an action that is already happening, but the sad truth is that most actions are done out of pattern and history and not because of measured value to the organization. Even worse, we build out giant project plans that we become inflexible about changing or disrupting, so focused on completion that we only pretend to care about the actual value of the project. The need to take a step back and understand that “if I am right, it will prove itself out; if I am wrong, it will show me a better way to do things” is easy to say but almost impossible to take hold of immediately. There is nothing worse than doing a massive project only to discover neutral or negative performance, yet this is by far the most common outcome for groups when they test out large redesigns. It is vital that programs, as they build out, not only test what they want to win, but also build into all their plans dynamic points where things can go in unexpected directions, so that they are not inflexible to the reality of quantitative results.

Ask yourself these questions before you take any action: “How do I prove myself wrong?”, “What if I am focusing on all the wrong things?”, “What are the other feasible alternatives for this page, section, or module?” It sounds counterintuitive, but it will help you understand just how much larger the testable world is than the world you would inherently start out with. You are not limited to only what you want; you are limited only by your imagination and the efficient use of your resources. Force every action to deal with these questions first, and not with the question of “how much better is this idea?”.

It is easy to limit the possible value of your program by thinking you are right. All people are wired to do so, and many build their empires off projecting this knowledge to others. Building out your tests to get past this sin, to instead find the right answer and know how it measures against the larger world, is vital to the level of value you can get from your testing program.