Category: Rants

April 5, 2013

How Analysis Goes Wrong: The Week in Awful Analysis – Week #8

How Analysis Goes Wrong is a new weekly series focused on evaluating common forms of business analysis. All evaluation of the analysis is done with one goal in mind: does the analysis present a solid case that spending resources in the manner recommended will generate more revenue than any other action the company could take with the same resources? The goal here is not to knock down analytics; it is to help highlight those who are unknowingly damaging the credibility of the rational use of data. What you don’t do is often more important than what you do choose to do. All names and figures have been altered where appropriate to mask the “guilt”.

If we really dive into the real world uses of data, there are two parts to every recommendation that an analyst makes. The first is that action is needed, and the second is what action to take. Fundamentally, the problems arise when we confuse which one of these we have actual valid data for, and even worse when we convince others based on those flawed assumptions. While the core goal of analysis is to encourage action, we are repeating the very flaws that analysts rail against if we present a course of action that is based not on factual data but on our own biases and opinions, to which we simply attach data as a means of justification.

A perfect example of this is a very common and easy ROI analysis across channels. The problems with attribution are too numerous to get into here, but suffice it to say this is yet another example of people confusing rate and value; more specifically, attribution does not mean generation. Because of this, it is easy to mistake this type of analysis, which can make a case for action, for some sort of determination of what action to take.

Analysis: By looking at our different marketing channels and the revenue that we attribute to them, we find that Paid Search has an ROI of 145%, Display 76%, Email 250%, and Organic Search 112%. Based on this, we recommend that you move as much of your budget as possible away from Display and towards Email and Paid Search.

The case you are making here is that you need to optimize your spend, or that you can in fact make more money by improving what you do. I find this an ironic statement, however, in that I would hope that every group knows that they need to constantly improve, and those that don’t know that are not likely to accomplish anything if they do act. If that is not the case, then one must question what the point of any argument is. It is either an argument made to avoid blame for past incompetence or one made solely to present evidence of growth, even if there is no functional improvement. In either case, the story is secondary to the suggested actions derived from the data.

The real problems here lie completely with the suggested steps to improve. Let’s dive into the components of it:

1) Just because I can attribute a 250% ROI to email, it doesn’t mean that I actually GENERATE $2.50 for every dollar I pump into email. The problem here is that you are simply saying that people who interact with email ended up giving us X amount of revenue. You have no way of knowing if it was the email that led to that revenue, or if people who make a purchase and plan to again are the same ones signing up for more communication.

2) You have no clue if these channels are even causing revenue at all. It is possible that by not showing someone a paid ad for a specific item, they would instead purchase a different item and generate more revenue. Even if you do not believe that is likely, you can’t in any way know how much of the revenue is generated solely from the channel.

3) Cross pollination of people hitting multiple channels is in here, so you had to pick an arbitrary method for assigning value. No matter what you choose, you are adding bias to the results and adding more confusion to the outcome.

4) Changes are not linear, so even if we moved budget from one channel to another, you don’t get linear outcomes. You might not make a single dollar more.

5) You don’t know what is cannibalizing the other sections. It’s possible that paid is taking away from organic, or display from organic, etc…

6) Because you don’t know what is generating revenue, then it is just as possible that display is generating revenue more than any of the other channels. While I hate anything that looks like it isn’t even breaking even, if the analysis is to cut the lowest performer, we have no measure of what the lowest performer is.

7) The entire analysis doesn’t even look at the correlation between fluctuations in spend and fluctuations in revenue. That approach is far from perfect, but it can at least start to show what incremental value you get from adding or reducing spend.
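
To make point 4 concrete, here is a minimal Python sketch. It assumes a purely hypothetical square-root response curve for email; the `revenue` function, the constant `k_email`, and the spend figures are all invented for illustration and do not come from any real channel. It also reads the 250% figure the way the analysis above does, as $2.50 of attributed revenue per dollar. Under those assumptions the average return matches the report, while the return on the next dollars moved into the channel is much lower.

```python
import math


def revenue(spend, k):
    """Hypothetical concave response curve: revenue grows with sqrt(spend)."""
    return k * math.sqrt(spend)


def average_return(spend, k):
    """Attributed revenue per dollar at the current spend level."""
    return revenue(spend, k) / spend


def marginal_return(spend, k, extra):
    """Revenue per dollar earned on `extra` dollars added on top of `spend`."""
    return (revenue(spend + extra, k) - revenue(spend, k)) / extra


# Assumed numbers: k_email is chosen so that the average return is exactly 2.5.
email_spend = 10_000.0
k_email = 250.0
moved_budget = 1_000.0

avg = average_return(email_spend, k_email)
marginal = marginal_return(email_spend, k_email, moved_budget)

print(f"average return per dollar:  {avg:.2f}")
print(f"marginal return per dollar: {marginal:.2f}")
```

The shape of the real curve is unknown, which is exactly the problem: the channel report gives you the average and nothing about the margin.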

Marketers and managers love this type of analysis because it makes an easy story and it seems to have a clear action. The reality is that it can be the most damaging, not only to the rational use of data, but also the company because the story that is presented has no basis in reality. You could be right and nail exactly the best way to change, or you could be dead wrong, or somewhere in between. The sad reality is that if you keep the conversation solely at this level, then there is no way for you to ever know what the real outcome is or for you to be able to justify anything you do or say afterwards.

March 28, 2013

How Analysis Goes Wrong: The Week in Awful Analysis – Week #7

Sometimes the worst mistakes are those that we make when we are trying to impress others with our work. Nothing leads to less credibility than trying to make up numbers or use some form of flawed logic to make it look like our work is the only reason that the organization exists. Oftentimes, these types of mistakes are the ones that we are least aware of. You will find that you can get away with this type of reporting in the short term, but as soon as someone tries to look behind the curtain, groups are left without meaningful answers and often spend the rest of their time updating resumes instead of improving analysis. A perfect example of this comes from a very public source, and while I normally would not name the direct example, in this case it is such a well-known and “popular” speaker that there is little need to mask identities. This week’s awful analysis comes from none other than Avinash and his blog Occam’s Razor.

Analysis – To show how much impact analytics has had, all you need to do is take the current revenue minus the past revenue, multiply it by the time period, and divide by the cost, and you get the ROI of analytics.

It is often times a good thing for people in analytics to understand how marketing can take data and exploit it for personal gain, but it is something very different when we fall into the exact same traps. There is so much wrong here that I hardly know where to begin, but let’s look at the highest level problems:

1) It attributes 100% of revenue gain to analytics, which ignores the fact that data is sinusoidal, meaning that it goes up and down all the time with no direct interaction. The complexities of the entire marketplace make it nearly useless to use a pre/post type of analysis with any kind of accuracy.

2) It assumes that the same revenue and resources that were spent here would not have been spent elsewhere. The people in these departments are not stupid, and while they may have followed 100% of your suggestions, it doesn’t mean that they would not have gotten more by doing what they would have done otherwise. If they would have generated a 150% increase, and your suggestions generated a 120% increase, you actually lost 30 percentage points of growth, not gained 120%. The same can be said for judging a test only by what won, and not by the difference between what won and what would have won had the test included only the original idea. The difference of the analysis in both cases is the expansion of opportunities, not the original opportunity itself.

3) It focuses on the suggestions, not on the accuracy of the actual work. Some of the worst people with data are analysts, as they have the most opportunity to abuse the data to push their agenda over someone else’s. Nothing is worse for the industry as a whole than analysts who do not understand the limitations of their own opinions and the need for rational uses of data, not stories and meaningless suggestions.

4) The entire point is someone trying to find revenue opportunities from only passive interaction with data. There is no way to know the influence or cost of an action in passive data (correlative), so why then would you take any credit for additional revenue? This is the classic analyst fallacy of pretending that the case for action and the action to take are both available in the same data. Active (causal) data acquisition is the only way to get these pieces of information, yet we are trying to make a claim that requires this information without it. There is nothing worse for rational uses of data than such an obvious abuse of data for a personal agenda by the analytics team itself.
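
The gap between the pre/post formula and point 2 can be shown with a few lines of arithmetic. The Python sketch below is only an illustration: the function names, the revenue figures, and the cost are hypothetical, and the counterfactual revenue is simply asserted here, whereas in practice it can only be estimated with something like a control group. It reuses the 120% and 150% increases from point 2.

```python
def naive_analytics_roi(current_revenue, past_revenue, periods, cost):
    """The pre/post formula from the analysis: (current - past) * time / cost."""
    return (current_revenue - past_revenue) * periods / cost


def counterfactual_roi(current_revenue, counterfactual_revenue, periods, cost):
    """Same shape, but compared against what would have happened otherwise."""
    return (current_revenue - counterfactual_revenue) * periods / cost


# Hypothetical figures for a single period.
past = 100_000.0
with_suggestions = 220_000.0     # a 120% increase over the past period
without_suggestions = 250_000.0  # the assumed 150% increase they would have had
analytics_cost = 50_000.0

naive = naive_analytics_roi(with_suggestions, past, 1, analytics_cost)
adjusted = counterfactual_roi(with_suggestions, without_suggestions, 1, analytics_cost)

print(f"pre/post ROI claimed:     {naive:+.1f}")
print(f"ROI versus counterfactual: {adjusted:+.1f}")
```

With these assumed numbers the same period reads as a large win under the pre/post formula and as a loss once the alternative is accounted for. Passive data gives you the first number and nothing about the second.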

I have seen this type of analysis done in many different ways over the years. In all cases, it is a clear sign both that the person doing the analysis does not understand what the data they report means, and that they are using their position only for personal gain and not organizational gain. I believe that data can add rationality and improve the performance of organizations by orders of magnitude, but the only way to do that is for analysts themselves to use their own data rationally. It is important that people understand your impact to the bottom line, but there are many ways to do this that do not require false statements and personal agendas.

Analytics can be a powerful tool that can shape organizations, but it can also be a weapon used to push one person’s agenda versus another, with the result being no gain to the organization but more internal politics. We have so much talk about the power of data, and about the potential of big data, and while you can use it to predict things and build tools that leverage it, the reality is that until we have people running programs who are interested in the real impact to the business, all the “promise” will be nothing but empty air. Just because you have analytics, or just because you build a recommendation or a tool that uses data, does not inherently mean it is providing value to anyone. Value is the additional growth of revenue in the most efficient way possible; it is not some flashy toy or your own personal agenda. If you want to really see data use expand throughout your organization, then stop abusing it yourself and help others see the real power of being able to explore and exploit information.

March 21, 2013

How Analysis Goes Wrong: The Week in Awful Analysis – Week #6

There are many different ways to abuse statistics to get meaning out of data that it really doesn’t have. I often find myself quoting George Box, “All models are wrong, some are useful.” Since “big data” seems to be driving people to more and more “advanced” ways of slicing data, I wanted to start looking at some of the most common ways people misunderstand statistics, or at least what you can do with the information that is presented by the use of various modeling techniques.

The first technique I want to tackle is clustering, or K-means clustering. This type of modeling allows people to divide users into groups that share common characteristics. The analysis looks at various dimensions of users who end up doing some task, usually a purchase or total revenue spent, and then statistically derives the defining characteristics that differentiate one group from another. This is very similar to decision tree methods as well, but tends to be much more open on how many groups and what dimensions are leveraged.

A typical use of this data would be:

Analysis: Looking at our data set of converters, we were able to build 3 different personas of our users: low value, medium value, and high value users. Based on the fact that high value users interact with Facebook campaigns and internal search far more than others, we are going to spend resources to improve our Facebook campaigns to get more people there, and look to make internal search more present for users in the other 2 persona groups.

Before we dive in, I feel it is necessary to remind everyone that the point here is not that optimizing Facebook campaigns or internal search is right or wrong, only that the data presented in no way leads to that conclusion.

1) Fundamentally, one of the problems with clustering is that it only tells you common traits of people in a group. You have no way of knowing whether people who do not normally interact with your Facebook campaigns will, once you get them to do so, behave anything like those who currently do. Most likely they won’t, and you have no way of knowing if that will generate any revenue at all.

2) There is a major graveyard effect going on, where looking at only those that convert and looking for defining differences avoids looking at the total population and looking at the differences between those that do convert and those that don’t. There is a pretty good chance that people who don’t convert also use internal search.

3) Even if you assume that everything is correct and that the two areas are highly valuable, you still don’t have a read on what to change, what the cost to do so is, or even what your ability to influence anything about that group is. You still have no way of saying that this is more valuable than any other action that you could take (including not doing anything).

4) It assumes that just because those are the defining areas, the place to interact with people is also that same place. It is just as likely that getting people to sign up for email also gets them to look at content on Facebook, for example.

5) Personas as a whole can be very dangerous, since they create tunnel vision and can lead to groups not looking at exploitability of other dimensions of the same population.

6) At the highest level, it confuses statistical confidence with colloquial confidence. The statistics tell you that these different characteristics are statistically different enough to create a known difference in groups. They in no way tell you that these differences are important to your business or how you can change your users’ behavior.
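
Point 2 is easy to see with a small tally. The Python sketch below uses entirely made-up visitor counts (the `visitors` table and every number in it are hypothetical) to show how a trait can look like a defining characteristic of converters while making no difference at all to the conversion rate once non-converters are included.

```python
# Hypothetical counts: (used_internal_search, converted) -> number of visitors.
visitors = {
    (True, True): 600,
    (False, True): 400,
    (True, False): 59_400,
    (False, False): 39_600,
}


def share_of_converters_using_search(counts):
    """What a converters-only analysis sees."""
    converters = counts[(True, True)] + counts[(False, True)]
    return counts[(True, True)] / converters


def conversion_rate(counts, used_search):
    """Conversion rate within one group, using the whole population."""
    converted = counts[(used_search, True)]
    total = converted + counts[(used_search, False)]
    return converted / total


print(f"converters who used search: {share_of_converters_using_search(visitors):.0%}")
print(f"conversion rate, search:    {conversion_rate(visitors, True):.2%}")
print(f"conversion rate, no search: {conversion_rate(visitors, False):.2%}")
```

In this invented data set most converters used internal search, yet searchers and non-searchers convert at the same rate. Looking only at converters cannot distinguish this situation from one where search actually matters.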

I am actually a huge fan of statistics and data modeling, but only as a way to think about populations. I get very worried when groups follow the results blindly, or do not actively interact with the data to learn important information, like cost and influence. If you have an analysis like this and you know the exploitability of the different dimensions, then you can do another step of analysis and look for size and ability to change the population based on what you know of the exploitability of the defining characteristics. If you do not have that, then the data is interesting but ultimately almost useless.

March 15, 2013

How Analysis Goes Wrong: The Week in Awful Analysis – Week #5

I once again heard the same common mistake in determining where to test earlier today, and it reminded me of the topic for this next Awful Analysis. It is super easy to find data that you believe is “interesting” or “compelling” but that in no way actually makes the rational argument you are attributing to it. An example is as follows:

Analysis: We found that people that interact with internal search spend 1/3 as much as people who don’t. This tells us that there is a massive opportunity to optimize internal search and increase total revenue.

This is probably the most common type of search for “test ideas”, as it sounds like a rational argument and it is using real numbers. The problems come from the fact that the data presented is a non sequitur as far as what to test. These types of arguments are interesting stories, and I do not always suggest that you avoid leveraging them. My problem is when people start to believe the story and do not realize just how irrelevant the information presented is.

As a reminder, knowing the efficiency or value of a test requires three pieces of information: population, influence, and cost. With that in mind, I want to dive into this type of analysis:

1) You have no way of knowing from the analysis (or any correlative information) if people who spend less do so BECAUSE of the use of internal search, or if people who are going to spend less are the ones who aren’t quite sure what they are looking for and instead choose to buy cheaper things. It could also be that you get a lower RPV because people doing research ARE far more likely to use search to compare items.

2) Even if you are right that they are more valuable, you have no clue whether the search results page is the place to influence them. Is it the entry channel? The landing page? Maybe the product page?

3) You have presented no evidence of your ability to influence that group even if you ignore #1 and #2. Even if you have the perfect group and the perfect place, you still have no insight into what to actually change.

4) There is nothing presented that says that this same group cannot be improved far more dramatically by looking at and interacting with them based on other population dimensions. Last I checked, new users, search users, purchasers, and IE users also use internal search.

5) There is no look at the cost to change this page (and population) versus known results, or even just the technical ramifications. Search results pages are often among the one or two hardest pages to test, simply from a technical resources and page interaction standpoint.
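
Point 1 can be illustrated with a small Python sketch. Everything in it is hypothetical: the two intent segments, the visitor counts, and the revenue per visitor are invented so that search users spend one third as much overall, even though using search changes nothing within either segment. It is one possible story consistent with the headline number, not a claim about how any real site behaves.

```python
# Hypothetical data: (segment, used_search) -> (visitors, revenue per visitor).
segments = {
    ("researching", True): (900, 1.0),
    ("researching", False): (430, 1.0),
    ("ready_to_buy", True): (100, 10.0),
    ("ready_to_buy", False): (470, 10.0),
}


def rpv(used_search, segment=None):
    """Revenue per visitor for search users or non-users, optionally in one segment."""
    visitors = 0
    revenue = 0.0
    for (seg, searched), (count, per_visitor) in segments.items():
        if searched == used_search and (segment is None or seg == segment):
            visitors += count
            revenue += count * per_visitor
    return revenue / visitors


overall_ratio = rpv(True) / rpv(False)
print(f"overall RPV, search vs. no search: {rpv(True):.2f} vs. {rpv(False):.2f}")
print(f"search users spend {overall_ratio:.2f}x as much overall")

for name in ("researching", "ready_to_buy"):
    print(f"{name}: {rpv(True, name):.2f} vs. {rpv(False, name):.2f}")
```

In this invented data the whole gap comes from who chooses to search, so “fixing” internal search would recover none of it. The correlative number alone cannot tell you whether you are in this world or a different one.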

More than anything, the threat of this type of analysis is that it sounds perfectly rational. Who wouldn’t want to “fix” a group that is spending 1/3 as much as another group? Why aren’t all users spending their entire paycheck on my site and my site only? As the analyst, you have to make sure that you are presenting rational, fact-based data if you expect anyone else to leverage data in a rational manner. You might be right or you might be wrong, but if you do not stop yourself from falling for these stories and do not hold yourself to a higher standard, then how can you expect anyone else to? If you are going to find data to tell a story, then what is the point of the data other than to present your opinions versus someone else’s?
