How You can Stop Statistics from Taking Advantage of You – Part 1

You can’t go five minutes in the current business world without the terms big data, predictive, or statistical tool being thrown about. If you were to believe all of the hype, you would have no problem making perfect decisions and acting quickly, and everyone would be improving their performance by millions of dollars every hour. Of course everyone in the field also acknowledges just how far everyone else is from that reality, but they fail to mention the same errors in logic in their own promises and their own analysis. All data is leveraged using mathematical tools, many of which are not understood at the level necessary to maximize their value. Data can be a powerful and important aid to improving business and a real deciding factor between success and failure. It can also be a crutch used to make poor decisions or to validate one opinion versus another. The fundamental truth is that nothing about “big data” is really all that new, and that in almost all cases, the promises that people are making have no basis in reality. It is vital that people understand the core principles of statistics that will enable them to tell which of those two roles data is playing and to help maximize the value that data can bring to their organization.

So how do you arm yourself to maximize outcomes and to combat poor data discipline? The answer lies in understanding key concepts of statistics, so that you can spot when and how promises are made that cannot possibly be true. You do not need to understand the equations, or even have mastery-level depth on most of these topics, but it is vital that you understand the truth behind certain types of statistical claims. I want to break down the top few that you will hear, how they are misused to make promises, and how you can really achieve that level of success.

Correlation does not Equal Causation –

Problem – I don’t think anyone can get through college without having heard this phrase, and most can quote it immediately, but very few really focus on what it means. The key thing to take from this is that no matter how great your correlative analysis is, it cannot tell you the cause of the outcome nor the value of items without direct active interaction with the data. No matter how well you can prove a linear correlation, or even find a micro-conversion that you believe is success, by itself it can never answer even the most basic of real-world business questions. Correlations can be guiding lights towards a new revelation, but they can also just be empty noise leading you away from vital information. It is impossible to tell which if you leave the analysis at just basic correlation, yet in almost all cases this is where people are more than happy to leave their analysis. The key is to make sure that you do not jump to conclusions and that you incorporate other pieces of information instead of blindly following the data.

Even if I can prove a perfect correlation between email sign-ups and conversion rate, that they both go up together, I can never know from correlation alone whether getting more people to sign up for emails CAUSED more conversions, or whether the people we got to convert more are also more interested in signing up for email. In a test this is vital because not only is it easy to see those two points, but you are also limited to a single data point, making even correlation impossible to diagnose. It is incredibly common for people to claim they know the direction and that they need to generate more email sign-ups in order to produce more revenue, but it is impossible to reach that conclusion from purely correlative information, and it can be massively damaging to a business to point resources in a direction that is as likely to produce negative results as positive ones.

The fundamental key is to make sure that you are incorporating consistent ACTIVE interaction with data, where you induce change across a wide variety of items and measure the causal value of them. When this is combined with, or used to lead, your correlative information, you can discover amazing new lessons that you would never have learned before. Without doing this, the data that many claim is leading them to conclusions is often incomplete or fundamentally wrong and can in no way produce the insights that people are claiming. The core goal is always to minimize the cost of this active interaction with data while maximizing the number and level of alternatives that you are comparing. Failure to do this will inevitably lead to lost revenue and often false directions for entire product road maps, as people leverage data to confirm their opinions rather than using data rationally to produce amazing results.
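To make the email sign-up example concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the hidden `interest` score, the conversion probabilities, and the function names `simulate`, `pearson`, and `conversion_rate` are hypothetical and not taken from any real data set. It builds two worlds, one where signing up really does cause conversions and one where a hidden interest level drives both, and shows that both produce a positive correlation. Only forcing the sign-up on or off, the kind of active interaction described above, tells the two worlds apart.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(signup_causes_conversion, force_signup=None, n=20000, seed=0):
    """Return (signups, conversions) as parallel lists of 0/1 flags.

    force_signup=None: visitors decide for themselves (observational data).
    force_signup=True/False: we set the sign-up state for everyone (intervention).
    """
    rng = random.Random(seed)
    signups, conversions = [], []
    for _ in range(n):
        interest = rng.random()  # hidden trait we never get to observe
        if force_signup is None:
            signed_up = rng.random() < interest  # interested people sign up more
        else:
            signed_up = force_signup
        if signup_causes_conversion:
            p_convert = 0.15 if signed_up else 0.05  # sign-up itself matters
        else:
            p_convert = 0.02 + 0.15 * interest  # only interest matters
        signups.append(int(signed_up))
        conversions.append(int(rng.random() < p_convert))
    return signups, conversions


def conversion_rate(signup_causes_conversion, force_signup):
    """Conversion rate when the sign-up state is forced for every visitor."""
    _, conversions = simulate(signup_causes_conversion, force_signup)
    return sum(conversions) / len(conversions)


for causal in (True, False):
    signups, conversions = simulate(causal)
    effect = conversion_rate(causal, True) - conversion_rate(causal, False)
    print(f"sign-up causes conversion: {causal}")
    print(f"  observed correlation:       {pearson(signups, conversions):.3f}")
    print(f"  effect of forcing sign-ups: {effect:.3f}")
```

In the second world, forcing sign-ups on or off does not move conversions at all, even though the observational data shows the two rising together.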

Examples – Multiple success metrics, Attribution, Tracking Clicks, Personas, Clustering

Solution – Causal changes can arm you with the added information needed to answer these questions more directly, but in reality that is not always going to be an option. If nothing else, always remember that for any data to tell you what led to something else, you have to prove three things:

1) That what you saw was not just a random outcome

2) That the two items are correlated with each other, and not just some other change

3) That the causal direction is the one you claim, because without it you cannot prove any conclusion

Just the very act of stopping people from racing ahead or abusing this data to prove their own agenda will dramatically improve the efficiency of your data usage as well as the value derived from your entire data organization.
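For the first of the three points, one common way to check whether a difference could be a random outcome is a permutation test. The sketch below is only an illustration with made-up 0/1 conversion data; `permutation_p_value` is a hypothetical helper, not something from the original analysis. It says nothing about points 2 and 3, and a small p-value on its own is not a reason to act.

```python
import random


def permutation_p_value(group_a, group_b, rounds=1000, seed=0):
    """Share of random relabelings whose gap in means is at least the observed gap."""
    rng = random.Random(seed)
    size_a = len(group_a)

    def gap(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = gap(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)  # pretend the group labels were assigned by chance
        if gap(pooled[:size_a], pooled[size_a:]) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)


# Made-up data: 1 = converted, 0 = did not convert.
control = [1] * 50 + [0] * 450  # 10% conversion
variant = [1] * 90 + [0] * 410  # 18% conversion
print(f"p-value: {permutation_p_value(control, variant):.4f}")
```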

Rate vs. Value –

Problem – There is nothing more common than finding patterns and anomalies in your analytics. This is probably the single core skill of all analysis, yet it is often among the most misused or abused actions taken with data. It can be segments that have different purchase behavior, channels that behave differently, or even “problems” with certain pages or processes. Finding a pattern or anomaly is at best the halfway point of actionable insight, not the final stop to be followed blindly. Rate is the pattern of behavior, usually expressed as a ratio of actions. Finding rates of action is the single most common and core action in the world of analytics, but the issue usually comes when we confuse the pattern we observe with the action to “correct” that pattern. As with Correlation vs. Causation above, though, a pattern by itself is just noise. It takes active interaction and comparison with other, less easily identified options in order to validate the value of those types of analysis.

That Google users spend 4.34 minutes per visit, or that email users have an average visit depth of 3.4 pages, are examples of rates of action. What they are not is a measure of the value of those actions. Value is the change in outcome created by a certain action, not the rate at which people happened to do things in the past. Most people understand “past performance does not ensure future outcomes” but they fail to apply the same logic when it comes to looking for patterns in their own data. Value is expressed as a lift or differentiation: things like adding a button increased conversion by 14%, or removing our hero image generated 18% more revenue per visitor.

The main issues come from confusing the ability to measure different actions with knowing how to change someone’s behavior. The simplest example of this is the null hypothesis: what would happen if that item wasn’t there? Say 34% of people click on your hero image, which is by far the highest amount on your homepage. What would happen if that image wasn’t there? You wouldn’t just lose 34% of people; they would instead interact with other parts of the page. Would you make more or less revenue? Would it be better or worse?
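The hero image question can be written down as simple arithmetic. The figures below are hypothetical test results, chosen only so that the lift matches the 18% example used earlier; `revenue_per_visitor` and `lift` are illustrative names. The point of the sketch is that the 34% click rate never enters the calculation of value.

```python
def revenue_per_visitor(total_revenue, visitors):
    """Revenue per visitor (RPV) for one experience."""
    return total_revenue / visitors


def lift(baseline, variant):
    """Relative change of the variant over the baseline."""
    return (variant - baseline) / baseline


# Hypothetical results from randomly splitting traffic between two homepages.
hero_click_rate = 0.34  # a rate: what people happened to do, not what it is worth
with_hero = revenue_per_visitor(total_revenue=25_000.0, visitors=10_000)
without_hero = revenue_per_visitor(total_revenue=29_500.0, visitors=10_000)

print(f"Hero click rate:        {hero_click_rate:.0%} (not used below)")
print(f"RPV with hero image:    {with_hero:.2f}")
print(f"RPV without hero image: {without_hero:.2f}")
print(f"Value of removing it:   {lift(with_hero, without_hero):+.1%}")
```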

It also comes down to two different business questions. At face value the only possible question you could answer with just pattern analysis is, “What is an action we can take?” In the ideal business case you would instead answer, “Based on my current finite resources, what is the action I can take to generate the most X?” where X is your single success metric. Rates of action carry no measure of ability to change or of cost to do so, and as such they cannot answer many of the business questions that they are erroneously applied to.

Examples – Personalization, Funnel Analysis, Attribution, Page Analysis, Pathing, Channel Analysis

Solution – The real key is to make sure that, built into any plans of optimization, you are incorporating active data acquisition and that you are always measuring null assumptions and the value of items. This information combined with knowledge of influence and cost to change can be vital, but without it a pattern is likely empty noise. There are entire fields of study in math dedicated to this, with the most common being bandit-based problem solving. Once you have actively acquired knowledge, you will start to build information that can inform and improve the cost of data acquisition, but never replace it.
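Bandit approaches come in many forms. As one hedged illustration, the sketch below uses epsilon-greedy, among the simplest of them; it is not necessarily the method the author has in mind, and the conversion rates, the `epsilon` value, and the function name are assumptions. Most traffic goes to whichever option currently looks best, while a fixed share keeps exploring the alternatives so that data acquisition never stops.

```python
import random


def epsilon_greedy(true_rates, visitors=20000, epsilon=0.1, seed=1):
    """Serve options one visitor at a time; return (times_shown, conversions)."""
    rng = random.Random(seed)
    arms = range(len(true_rates))
    shown = [0] * len(true_rates)
    converted = [0] * len(true_rates)
    for _ in range(visitors):
        if 0 in shown or rng.random() < epsilon:
            arm = rng.randrange(len(true_rates))  # explore: pick any option
        else:
            # exploit: pick the option with the best observed rate so far
            arm = max(arms, key=lambda i: converted[i] / shown[i])
        shown[arm] += 1
        if rng.random() < true_rates[arm]:  # simulated visitor response
            converted[arm] += 1
    return shown, converted


# Hypothetical true conversion rates, unknown to the algorithm.
shown, converted = epsilon_greedy([0.02, 0.03, 0.10])
for i, (s, c) in enumerate(zip(shown, converted)):
    print(f"option {i}: shown {s} times, observed rate {c / s:.3f}")
```

The exploration share is the ongoing cost of data acquisition in this sketch; lowering `epsilon` makes it cheaper but slower to notice when a different option becomes the better one.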

These are but two of the many areas where people consistently make mistakes when leveraging data and concepts from statistics and end up with false conclusions. Data should be your greatest asset, not your greatest liability, but until you help your organization make data-driven decisions and not data-validated decisions there are always going to be massive opportunities for improvement. Make it a focus to improve your organization’s understanding of and interaction with each of these concepts and you will start using far fewer resources and producing far better outcomes. Failure to do so ensures the opposite outcomes over time.

Understanding data and data discipline has to become your biggest area of focus, and educating others your primary directive, if you truly want to see your organization take the next step. Don’t let just reporting data or making claims of analysis be enough for you, and you will quickly find that it is not enough for others.

Rant – Testing is about the Driver, not the Car

I recently answered a question on the value of testing on Quora and was asked to re-post my response here by a few people I know in the industry.

Question: I’ve been all about A/B testing, but then I just read this post from Erik Severinghaus. Is A/B testing as valuable as we think?

Answer: Like many things in life, the answer is not that simple. Think of it like driving a car: there are good drivers, slow drivers, oblivious drivers, and angry drivers, and skill ranges from low to professional. The issue is in the driver, not in the concept of a car.

Testing is much the same way. The reality is that in many organizations (including many that champion testing to death) there is very little value in how they are leveraging testing. There are many cases where testing is actually costing those companies money, because they are not disciplined in how they approach things, they focus on idea validation, and they do not understand how to act on data, doing things like blindly following statistical confidence. If you look at how he describes testing in that blog post, then this is where those people are. If MVT is simply a way to throw a bunch of items against a wall and choose a winner, then you know that you are firmly in this realm. In those cases, I would argue that testing is worse than a mouse pad; it is more analogous to a cup holder in a car. It is there, people use it, they get enjoyment out of it, but it has nothing to do with where the car ends up or how fast it gets there.

There are other organizations that look at data differently and use testing in a different manner, one that focuses and leverages resources rather than validating ideas or “choosing 2 headlines”. In those situations, there is very little that can be said to describe just how valuable testing is. Testing changes the direction of entire organizations, it proves people wrong, it focuses resources, and it allows for the exploration of alternative feasible options and lets you really know the value of actions, not just argue them. It is a tool whose use is to find out which of many different routes is the most valuable, and then to help you drive down those roads, providing more and more value at each step. In those scenarios testing is more analogous to the GPS, describing routes and shorter distances, as well as helping make the most of time and fuel.

In both cases, there are ways to automate the process to lower decision time and to increase the efficiency of the test itself. That doesn’t address the real problem, however: if the entire vision of testing is wrong, then it doesn’t matter what system you use to make decisions or how you leverage MVT. It really doesn’t matter what size drink goes into the cup holder, or how many different drinks can be placed there over time. If you are going down the other route, then how fast your GPS updates, what information it uses, and what factors you use to decide routes can have a massive impact on where you end up.

Why we do what we do: Why do we Shoot the Messenger? – Backfire Effect

At some time, everyone that works in data has had to deal with the following scenario:

You run a test or you do an analysis that shows that a member of management has been claiming something or pushing something that is clearly wrong. You present the data, and then they push back even harder, saying that you just don’t understand or there must be more to the story. You dive back in, find more and more supporting data, you make charts and breakdowns and present them again. This time, instead of just pushing back, your recipient starts attacking you and everything you do. They may do it overtly or behind the scenes, but they now view you as a problem and a threat. They never change their view of your original point, and now they distrust you and are looking for opportunities to attack your work.

This is a way too common outcome in the business world, and one that is not actually limited to the use of data. What you are experiencing is the Backfire Effect, or the fact that people become stronger in their beliefs when presented with evidence that directly contradicts them.

So why does this happen? Why is the data you are clearly presenting, data that multiple others agree with and buy into, not having its desired effect? It is because you have started to attack their world view. Every person you ever work with believes that they do superior work, that they make a large impact on the business, and that they hold a deep understanding and correct view of how things work. When you present direct evidence against this, you are not actually attacking the statement, but their self-perception, which creates a level of cognitive dissonance, resulting in an ad hominem attack on the messenger and a blind ignorance of the evidence.

Like most psychological biases, the key is to set the stage for success prior to action, not after. You may not be able to force rationality into individuals or organizations, but you can certainly push discipline. Define rules of action before you start any task, and work to get agreement on what will define success and what the follow-up action should and will be. Oftentimes these conversations are pushed off, ignored, or dismissed, but it is up to you, as the one who will ultimately be sharing the news, to force this as a priority of the conversation.

No one you work with will want to talk about how you make a decision; they will want to talk about their great idea for a test, or for a group to target, or their amazing advertising campaign. They have already decided what they want, why it is great, and what you will present in the end. If you only enter the conversation at this point, your role in their subconscious mind is simply to validate their opinion. The job of those who work in data is to never give in to this path, no matter how easy it is or how it may help us politically. You cannot ever view success as how many actions you fulfill, but instead as the value of the ones that you fulfill. The instant you allow quantity of action to take precedence over quality of outcomes, you are setting yourself and others up for this type of failure. Our job is instead to be the holders of discipline, to be the ones who help create opportunities to find the faults in these ideas, and not to be the ones who validate held world views.

This is also why changing the conversation about what it means to be “right” and “wrong” is so important. If you shape each conversation to talk about the amazing outcomes of being “wrong”, of going in a direction not previously encouraged, and about the impact to the business, you are opening the door for individuals to not have their world view attacked. If you allow others to understand that they have impacted the business, and that they have succeeded in their end goal by finding out the cases where they are wrong, you have enabled them to not fall into the Backfire Effect. Changing the conversation away from the faults of one idea and towards the value of different options, and why choosing this action matters, allows you to not attack someone’s world view and instead help them look good by giving them the tools to find an outcome, not just an input to a failed system. It is important that you understand deeply why you need to do this, what the traps are, and what the right way to frame that conversation is, but if you are willing to do the groundwork you can achieve amazing results.

One of the defining characteristics of organizations that get value from their data versus those that don’t is that the leaders who manage their data focus on the leading conversation, not on the stories they can tell after their analysis. This problem is only exacerbated by egos and by the fact that so much of the material and talk in the industry is filled with justifications for those who do not want to address the real issues at hand. Much of the data marketplace, from managers to agencies, is filled with those who would come up with creative ways to tell people exactly what they want to hear and to come up with a story that shows impact, even if there is no factual basis for that claim. There are articles, speakers, and “experts” throughout who have mastered the art of sounding intelligent without actually adding anything new or functional to the organizations they address. Many of these groups have their own biases in believing their value has been shown, just like any other group, because they focus simply on the actions someone takes or on their ability to make a recommendation. Your key responsibility is to focus the same skills and control the message in the same way towards that which will actually drive value for the organization, not that which sounds good but is hollow. It is vital that from day one onwards leaders control and help shape the conversation instead of responding continuously to requests. Successful organizations define actions and successes, focus on discipline, and prepare for action before the data, not after.

There is no truer statement than: “Success and failure is determined before you act, not after.”

There is zero chance of you avoiding push-back if you fail to do the dirty work of setting the stage properly. If you create an environment where you focus not on the idea but on the discovery, on the outcome and not the input, and work with groups to add value to their ideas instead of just facilitating them, you will find amazing results achieved throughout the organization. Ultimately you need to be agnostic about what wins and loses, and instead focus on how people arrive at a decision and whether it answers the correct business question. Shy away from this aspect of the job and you face the challenge of dealing with the Backfire Effect or finding ways to justify actions you rationally know are not valuable.

If you want to avoid painful confrontations, you will always have two options. Option one is the easier one: convince yourself that presenting data that supports people politically, or that just getting someone to act, is somehow providing value. In this option you will never be delivering news people don’t want to hear. The second option is to focus on the painful disciplines prior to action, dealing with the discomfort before you get too far and staying away from anyone’s ego. In this option you will have to deal with some discomfort, but you will be able to make a true and meaningful impact on your organization. Other departments, executives, and even your own management will never be able to make this decision for you. This decision is a personal one, and one that you either choose to make or have chosen for you.

How Analysis Goes Wrong: The Week in Awful Analysis – Week #9

How Analysis Goes Wrong is a new weekly series focused on evaluating common forms of business analysis. All evaluation of the analysis is done with one goal in mind: does the analysis present a solid case that spending resources in the manner recommended will generate more revenue than any other action the company could take with the same resources? The goal here is not to knock down analytics; it is to help highlight those who are unknowingly damaging the credibility of the rational use of data. What you don’t do is often more important than what you do choose to do. All names and figures have been altered where appropriate to mask the “guilt”.

I have a special place in my heart for all the awful analysis that is currently being thrown around in regards to personalization. So many different groups are using personalization as the outward advantage prevalent with big data. No matter where you go, ad servers, data providers, vendors, agencies, and even internal product teams, they are all trying to talk about or move towards personalization.

This is not to say that personalization is a bad thing. I believe that dynamic experiences can produce magnitudes higher value than static experiences, and I have helped many groups achieve just that. What most surprises me, however, is the awful math being used to show the “impact” of personalization by groups who have achieved absolutely nothing. I have lost count of the number of times I have walked in and found one person or group talking about how personalization has improved their performance by some figure so fantastic that it seems the business should be doing nothing but thanking them for their genius. The sad reality is that most of the analysis is so biased and bad that in most cases the same companies are actually losing millions through this “personalization” practice.

Analysis – By putting in place personalization, we were able to improve the performance of our ads by 38%.

We have to tackle the larger picture here to evaluate statements such as above. Before we dive too deep into how many things are wrong with this analysis, we need to start with a fundamental understanding of one concept. There is a difference between the changing of content or the user experience and the targeting portion of that experience. In other words, changing things will result in an outcome, good or bad, and then targeting specific parts of that change to groups is also going to lead to an outcome. The only way that “personalization” can be valuable is if that second part of the equation is the one leading to a higher outcome.

1) Just to get the obvious out of the way, the analysis doesn’t tell you what the improvement was. Was it clicks? Visits? Engagement? Conversion? Or RPV? If it is anything but RPV, then reporting any increase has no bearing on the revenue derived for the organization. Who cares if you increased engagement by 38% if total revenue is down 4%?

2) The only way that “personalization” can be generating a 38% increase would be if the following were true:

The dynamic changes of content raised total RPV by 38% over any of the specific static content or content served in ANY other fashion.

In other words, if I would have gotten a 40% increase by showing offer B to everyone, then personalization is actually costing us 2 percentage points of lift.
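The arithmetic behind this comparison is short enough to sketch. `incremental_lift` is a hypothetical helper, and both lifts are assumed to be measured against the same original baseline. Note that the gap is 2 percentage points against that baseline, which works out to roughly 1.4% less RPV than simply showing offer B to everyone.

```python
def incremental_lift(personalized_lift, best_alternative_lift):
    """Lift of personalization relative to the best alternative experience.

    Both inputs are lifts over the same original baseline (0.38 means +38%).
    """
    return (1 + personalized_lift) / (1 + best_alternative_lift) - 1


reported = 0.38      # the headline number from the analysis
offer_b_only = 0.40  # hypothetical: offer B shown to everyone

print(f"gap in points of lift: {(reported - offer_b_only) * 100:+.0f}")
print(f"relative to offer B:   {incremental_lift(reported, offer_b_only):+.2%}")
```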

3) Since most personalization is tied to content, and the inherent nature of content changes is a very high initial difference that then normalizes over time, what is the range of outcomes? What is the error rate? The inherent nature of any bandit problem that would use causal data to update content means that you either have to act as quickly as possible, resulting in a higher chance of error, or act slowly and risk not responding fast enough to the market. In either case, performance will never be consistent.

Rather than continue to dive through each and every biased and irrational part of this analysis, I want to instead present two ways that you can test out these assumptions to see the actual value of personalization:

Set-up: Let’s say that you believe that 5 different pieces of content are needed for a “personalized” experience. In other words, you have a schema that will change content by 5 different rules.

The same steps work for anything from 2 rules to 200.

Option #1 (the best option):

Serve all 5 pieces of content to 100% of users randomly and evenly. Look at the segments for the 5 rules AND all other possible segments that make sense.

You will get one of two outcomes:

1) Each piece of content is the highest performing one for that specific segment and those are the highest value changes

Or

2) ANY OTHER OUTCOME which by definition in this case results in more revenue.
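The comparison in Option #1 can be sketched as follows. The segments, content names, traffic shares, and RPV figures are all invented for illustration (three of each to keep it short), and `blended_rpv` is a hypothetical helper. The sketch scores the planned personalization rules, every static alternative, and the best content per segment from the same randomized data.

```python
# Hypothetical RPV per segment and content, from serving all content randomly and evenly.
rpv = {
    "segment_a": {"content_1": 2.00, "content_2": 2.50, "content_3": 1.80},
    "segment_b": {"content_1": 1.50, "content_2": 2.20, "content_3": 1.60},
    "segment_c": {"content_1": 1.00, "content_2": 1.40, "content_3": 1.90},
}
# Hypothetical share of total traffic that falls into each segment.
traffic_share = {"segment_a": 0.5, "segment_b": 0.3, "segment_c": 0.2}

# The personalization rules you planned to ship (content N for segment N).
planned_rules = {
    "segment_a": "content_1",
    "segment_b": "content_2",
    "segment_c": "content_3",
}


def blended_rpv(rules):
    """Overall RPV if each segment is served the content named in `rules`."""
    return sum(traffic_share[seg] * rpv[seg][rules[seg]] for seg in rpv)


contents = ["content_1", "content_2", "content_3"]
# Score each piece of content as if it were shown to everyone.
static_options = {c: blended_rpv({seg: c for seg in rpv}) for c in contents}
best_static = max(static_options, key=static_options.get)
# Let the data, not the plan, pick the best content for each segment.
best_rules = {seg: max(rpv[seg], key=rpv[seg].get) for seg in rpv}

print(f"planned personalization:  {blended_rpv(planned_rules):.2f}")
print(f"best static ({best_static}): {static_options[best_static]:.2f}")
print(f"best content per segment: {blended_rpv(best_rules):.2f}")
```

With these made-up numbers the planned rules lose to simply showing content_2 to everyone, which is the second outcome described above.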

Option #2

Create dynamic logic in the tool, based on the 5 rules.

Create 6 experiences.

Each experience except the last shows one piece of content to all users (so the content matching group A gets served to all 5 user definitions in recipe A). In the last experience, add the dynamic rules.

If the last experience wins, then you at least know that the dynamic content is better than static content. If you are looking at your segments correctly, you will then also be able to calculate the total lift from other ways of looking at the content to the dynamic experience that you tested. If the dynamic experience is still the top performer, congratulations on being correct. If any other way works best, congratulations on finding more revenue.

In both of these tests, if something else won, then doing what you were going to do, or what you would otherwise report on, IS COSTING THE COMPANY MONEY.

There are massive amounts of value possible by tackling personalization the right way. If you do rational analysis that looks for total value, then you will find that you can achieve results that blow even that 38% number out of the water. Report and look at the data the same way that most groups do though, and you are ensuring that you will get little or no value, and that you are most likely going to cost your company millions of dollars.