Optimizing the Organization: Maximizing the Time and Focus of your Program

Because of how most organizations work, it is common to find testing bolted onto the existing roles or structure of the analytics group. From a very high-level view this makes sense: both disciplines use data to make decisions, and both can be viewed as a shared resource across the entire organization. The failure comes, however, when people who are used to treating every problem as an analytics issue try to force the same actions onto testing.

The real key to success in adding optimization to your organization is in how you tackle the fact that it is a new discipline, more than in whom it reports to or where it fits into the larger picture. Most groups worry about head count and resources and fail to focus on the skills and actions that determine success with optimization as opposed to general analytics or business intelligence work. Others' lack of knowledge is often leveraged to help people gain oversight of the program. The real problems arise, however, when people do not then adapt their usage to meet the new needs and instead simply try to come up with stories about the value of the program.

Once mature and with key people in place, combining the programs can add a lot of value. Without a deep understanding of those differences, however, the inevitable result is less value and wasted time as people make mistakes that they do not even know are mistakes. Resources are a precious commodity, and the ultimate expression of optimization is to leverage them in the most productive way possible. To ensure this outcome, it is important that you focus the time of your optimization team on actions that will maximize outcomes and help grow understanding for your entire organization.

To maximize success and ensure focus on the right actions, here is a breakdown of the time spent on, and the value derived from, the main activities of a successful optimization team.

1) Active Data Acquisition – 80-90% of the value of most optimization programs comes not from the commonly imagined validation role but from continual active data acquisition and the comparison of feasible alternatives. It takes time for groups to reach this role, but when they do, the value delivered to the organization increases by magnitudes. This work is often built around the concepts of bandit-based optimization and fragility, and is used as an ongoing effort to challenge assumptions and actively measure the value of different alternatives.

In this role the optimization team leverages low-resource efforts that consistently measure as many feasible alternatives as possible, across the entire site. The aim is to maximize the discovery of exploitable opportunities; the primary role is to challenge assumptions that would otherwise never be questioned. The team needs access to pages, and clear rules on measuring success and leveraging resources, in order to produce constant and impactful lessons that can shape and direct product discussions and roadmaps.
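To make that concrete, here is a minimal epsilon-greedy bandit sketch in Python. The variant names, reward odds, and function names are all hypothetical illustrations, not a prescription from any particular tool:

```python
import random

# hypothetical running totals for a few feasible page alternatives
variants = {
    "control":  {"trials": 0, "reward": 0.0},
    "no_hero":  {"trials": 0, "reward": 0.0},
    "new_copy": {"trials": 0, "reward": 0.0},
}

def choose_variant(epsilon=0.1):
    """Epsilon-greedy bandit: mostly exploit the best-performing
    alternative, but keep exploring so assumptions stay challenged."""
    if random.random() < epsilon:
        return random.choice(list(variants))  # explore an alternative
    return max(variants, key=lambda v: variants[v]["reward"]
               / max(variants[v]["trials"], 1))  # exploit the leader

def record_outcome(variant, reward):
    """Update running totals after observing a visitor-level outcome."""
    variants[variant]["trials"] += 1
    variants[variant]["reward"] += reward

# simulated usage with made-up conversion odds per variant
odds = {"control": 0.040, "no_hero": 0.047, "new_copy": 0.042}
for _ in range(10_000):
    v = choose_variant()
    record_outcome(v, 1.0 if random.random() < odds[v] else 0.0)
```

The exploration rate is the knob that keeps the program continually acquiring data on alternatives rather than settling on a single answer.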

2) Education – This is 5-15% of the value derived from the optimization group, but it enables the actions that produce the greatest returns. Because optimization requires very different ways of thinking about and executing on actions in order to provide the most value, one of the key roles of the optimization team is an ongoing, consistent conversation with groups about different ways to think about problems.

It is vital that this conversation always happens prior to any action, and that optimization is not thought of as just another item on a release calendar. Groups that fail to think differently are guaranteed to get much lower value from their testing efforts, waste far more resources, and often end up with much slower and less productive product teams overall. They will get results; they will simply be far smaller results at a much higher cost. Failure to focus on education often pushes groups into a purely responsive role and leads to programs that are happy with the number of tests they run or with simply producing a single positive result.

No organization starts out looking at the correct things perfectly, and without fail someone's personal agenda leads them to subconsciously seek out confirming actions and data in order to make themselves look good. Building, maintaining, and educating people on proper data discipline is the single most consistent and important topic of education.

Here are a few of the key topics that groups can and should focus on:

i. Rules of Action – Knowing how to act on data, being disciplined enough to act neither too fast nor too slow, and looking only at the metrics that matter to a decision are vital for any data organization (a minimal sketch of such rules follows after this list).

ii. Statistical disciplines – There are many different ways to think about data and testing, and it is vital that people be exposed, and open, to approaches beyond those they already know in order to maximize future growth.

iii. Psychological disciplines – Optimization touches on many psychological concepts, such as confirmation bias, the Forer effect, congruence bias, and many others.

iv. Knowledge Share – When you are running a successful optimization program, you will constantly be learning things that go against previously held beliefs and opinions. These lessons are the single most valuable part of a successful program and become a core component once the program has matured.
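To illustrate what the rules of action from item i might look like in practice, here is a minimal Python sketch; the thresholds, field names, and function are hypothetical examples, not prescriptions:

```python
# hypothetical rules for deciding when a test result is actionable
RULES = {
    "min_visitors_per_variant": 5_000,        # guard against acting too fast
    "min_run_days": 14,                       # cover full weekly cycles
    "max_run_days": 45,                       # guard against acting too slow
    "success_metric": "revenue_per_visitor",  # the one metric that matters
}

def ready_to_act(visitors: int, days_run: int) -> bool:
    """A result is actionable only once both the sample-size and the
    duration rules are met; intermediate readings are ignored."""
    if days_run >= RULES["max_run_days"]:
        return True  # act on what you have rather than waiting forever
    return (visitors >= RULES["min_visitors_per_variant"]
            and days_run >= RULES["min_run_days"])

print(ready_to_act(visitors=6_200, days_run=10))  # False: too fast
print(ready_to_act(visitors=6_200, days_run=15))  # True
```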

3) Ad hoc analysis and validation testing – At most this represents 5% of the possible value provided by testing. It is always fun to focus on who has the best idea to improve things, but ultimately this is the least important part of a successful program. The better the input, the better the output, but only if it is going through a great system. With a poor system, it really doesn't matter how good any idea is.

This is the part that most groups are familiar with, where they respond directly to test ideas or to requests for more data or details on specific tests.

Generally this time is best spent redirecting people towards higher-value uses of both time and data.

Successful programs have their time breakdown in the range of:

Active Data Acquisition / Ongoing optimization – 60-70% of time
Education – 15-20% of time
Ad hoc analysis and validation testing – 15-20% of time

Success is generally more about the thought behind and usage of the program than about who owns it. The real keys to a successful program are to differentiate the roles and skills involved. If a program is just starting out, it may have 1-2 people who work on testing in some form, with a primary focus on working with different groups and their ideas to provide value. Mature programs might grow to 5-10 people and beyond as they continue to expand.

It does no good to add more people to a problem if you aren't fixing the real problem, which is how the time is spent. More time does not equal more value; better usage of time means more value. It is never easy to go past your comfort zone, but that is where you will find all the value. Think about how and where you are spending your time.

End the Debate: Which should you use? Visitor vs. Visit

There are many challenges for anyone entering a new field of study or a new discipline. We all come to any new concept with our previously held knowledge and beliefs filtering and changing how we view the new thing before us. Some choose to make it fit their world view, others dismiss it from fear, and others look for how it can change their current world view. Usually in these situations I quote Sherlock Holmes: "It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts." Nothing represents this challenge in online marketing more than the differences between analytics and optimization, and nothing represents that struggle more than the debate about visit-based measurement versus visitor-based measurement.

The debate about whether to use a visit, impression, or visitor basis for analysis is a perfect example of this problem, as it is not as simple as always using one or the other. When you are doing analytics, visits are usually the best way to look at data. When you are doing optimization, there is never a time when visits will present more relevant information than a visitor-based view of the data.

Analytics = Visit
Optimization = Visitor

The only possible exceptions are when you are using adaptive learning tools. While the rules can be simple, a deep understanding of the why behind them presents many other opportunities to improve your overall data usage and the value derived from every action.

Since most people reading this come from an analytics background, let's look at what works best in that environment. Analytics is a single-data-set correlative metric system, which is a long way of saying it counts things on a consistent basis in only one set of data, even if that data has many different dimensions. You are only recording what was, not what could or should be. In that environment, you have to look at data in some very particular ways. First among those is a very tight control on accuracy, since in many cases that data is used to represent what the business did, and hopefully to make predictions about the future.

It is also important that you are consistent with how you measure and that you look at things on a common basis. Because most people are comfortable looking at a day or a shorter-term basis, the easiest method is going to be a visit. It works great because you are trying to look at interactions and to measure a raw count of things that did happen, e.g. how many conversions, or how many people came from SEO. In those cases, a raw count in a correlative context is best represented on a visit basis, since it mitigates lost data (though not a massive amount) and it best reflects the common basis on which people look at data.

In the world of optimization, however, you have a completely different usage and type of data. In optimization we are looking at a single comparative data point and trying to represent an entirely different measure: influence on behavior over time. It doesn't matter if your site changes once a year or once an hour, or if your buying cycle is 1 visit or 180 days; all of those things are irrelevant to the fact that you are influencing a population over time. Because behavior is defined as influence on a population, and because we are looking comparatively over time, the measurement techniques used in analytics need to be rethought. Any concern about accuracy, past a simple point, becomes far less important than a measure of precision (consistency of data collection), since any error introduced is going to be equally distributed across the groups. It doesn't matter if the common basis is $4.50 or $487.62; what matters is the relative change based on the controlled factor. We are also focused far more on the influence than on the raw count, which means we are really talking about the behavior of the population.

In analytics you are thinking in terms of the count of the outcome (rate), whereas in optimization the focus is on the influence (value). To really understand optimization, you have to understand that all groups start with a standard propensity of action, which is represented by your control group. If you do nothing, the people coming to your site, in all stages and all types of interaction, measure up to one standard measure across your site (though all measurement systems have some small degree of internal variance). Since we are measuring not what the propensity of action is but what our ability to positively or negatively influence it is, we need to think in terms of reporting based on visitors and based on the change (lift), not the raw count.
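As a minimal illustration of reporting lift rather than raw count, consider this short Python sketch, reusing the $4.50 and $487.62 bases from above (the variant figures are invented):

```python
def lift(control_value, variant_value):
    """Relative change (lift) of a variant against the control group's
    standard propensity of action."""
    return (variant_value - control_value) / control_value

# the common basis does not matter; the relative change does
print(f"{lift(4.50, 4.86):+.1%}")      # +8.0% on a $4.50 basis
print(f"{lift(487.62, 526.63):+.1%}")  # +8.0% on a $487.62 basis
```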

You also have the issue of time, where we need to measure total impact over time. While it is correct that every time a visitor hits your site you have a chance to influence them, it is important to remember that the existing propensity-of-action measurement already accounts for this. What we are looking for is a simple measure of what we accomplished in terms of getting them to spend more. This means that we have to think in terms of both long- and short-term behavior. Some people will purchase today, some three visits later, but all of that is part of standard business as usual. It is incredibly easy to have scenarios where you get more immediate actions but fewer long-term actions. This means that on a daily basis you might see a short-term spike, while for the business overall you are actually making less revenue. This possibility creates two possible measurement scenarios:

1) There is no difference between short-term and long-term behavior, meaning the short-term spike carries through and is also positive in the long term. In this scenario, the only way to know that is to look at the long term.

2) Short-term and long-term behaviors differ, and we get a different outcome by looking at the visitor metric over time. In this scenario, the only view that produces a positive outcome for the business is the visitor-based metric view.

In both cases the visitor-based metric view gives us the full picture of what is good for the business, while the visit-based view either adds no value or adds negative value by reaching a false conclusion. The visitor-based view is not only the most complete view, no matter the situation, but also the only one that can give you a rational view of the impact of a change. To top it off, the choice to look only at the shorter window creates a distribution bias, by valuing short-term behavior over long-term behavior, which may call into question the relevance of the data used to reach any conclusion.
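Here is a small Python sketch of how the two views can diverge on the same data; the event log is a made-up illustration:

```python
from collections import defaultdict

# hypothetical event log: (visitor_id, visit_number, converted)
events = [
    ("a", 1, False), ("a", 2, True),                   # converts on a later visit
    ("b", 1, True),
    ("c", 1, False), ("c", 2, False), ("c", 3, True),  # long buying cycle
    ("d", 1, False),
]

# visit-based rate: conversions divided by total visits
visit_rate = sum(converted for _, _, converted in events) / len(events)

# visitor-based rate: each visitor counts once across the full window
by_visitor = defaultdict(bool)
for visitor, _, converted in events:
    by_visitor[visitor] |= converted
visitor_rate = sum(by_visitor.values()) / len(by_visitor)

print(f"visit-based:   {visit_rate:.0%}")    # 43%
print(f"visitor-based: {visitor_rate:.0%}")  # 75%
```

Only the visitor-based figure reflects what the influenced population actually did over the full window.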

The visitor- vs. visit-based view of the world is just one of many massive differences that reduce the value derived from optimization if not understood or not evaluated as a separate discipline. Because it is so easy to rationalize sticking with what is comfortable, it is common to find this massive weakness propagated throughout organizations with no measure of what it really costs. While not as damaging as other weaknesses, like not having a single success metric or not understanding variance, it is vital that you think about visit- and visitor-based data as attached to the end goal and not as a single answer to everything.

In the end, the debate is not really about visits versus visitors; there are clear reasons to choose visits for analytics and visitors for optimization. The real challenge is whether you and your organization understand the different data disciplines being leveraged. If you constantly look for different ways to think about each action, you will find new and better ways to improve value; if you fail to do so, you will cause damage throughout your organization without even knowing you are doing it.

How You can Stop Statistics from Taking Advantage of You – Part 1

You can't go five minutes in the current business world without the terms big data, predictive, or statistical tool being thrown about. If one were to believe all of the hype, you would have no problem making perfect decisions and acting quickly, and everyone would be improving their performance by millions of dollars every hour. Of course everyone in the field also acknowledges just how far everyone else is from that reality, but they fail to see the same errors in logic in their own promises and their own analysis. All data is leveraged using mathematical tools, many of which are not understood at the level necessary to maximize their value. Data can be a powerful and important aid to improving business and a real deciding factor between success and failure. It can also be a crutch used to make poor decisions or to validate one opinion over another. The fundamental truth is that nothing about "big data" is really all that new, and in almost all cases the promises people are making have no basis in reality. It is vital that people understand the core principles of statistics so they can tell when data is being used in either of those two roles, and so they can maximize the value that data brings to their organization.

So how then do you arm yourself to maximize outcomes and combat poor data discipline? The key is understanding core concepts of statistics, so that you can spot when and how promises are made that cannot possibly be true. You do not need to understand the equations, or even have master-level depth in most of these topics, but it is vital that you understand the truth behind certain types of statistical claims. I want to break down the top few that you will hear, how they are misused to make promises, and how you can really achieve that level of success.

Correlation does not Equal Causation –

Problem – I don't think anyone can get through college without having heard this phrase, and most can quote it immediately, but very few really focus on what it means. The key thing to take from it is that no matter how great your correlative analysis is, it cannot tell you the cause of an outcome, nor the value of items, without direct active interaction with the data. No matter how well you can prove a linear correlation, or even find a micro-conversion that you believe represents success, by itself it can never answer even the most basic real-world business questions. Correlations can be guiding lights towards a new revelation, but they can also be empty noise leading you away from vital information. It is impossible to tell the difference if you leave the analysis at just basic correlation, yet in almost all cases this is where people are more than happy to leave it. The key is to make sure that you do not jump to conclusions and that you incorporate other pieces of information instead of blindly following the data.

Just because I can prove a perfect correlation between email sign-ups and conversion rate, that they both go up, I can never know from correlation alone whether getting more people to sign up for emails CAUSED more conversions, or whether the people we got to convert more are also more interested in signing up for email. In a test this is vital, because not only is it easy to conflate those two readings, but you are also limited to a single data point, making even correlation impossible to diagnose. It is incredibly common for people to claim they know the direction and that they need to generate more email sign-ups in order to produce more revenue, but it is impossible to reach that conclusion from purely correlative information alone, and it can be massively damaging to a business to point resources in a direction that is equally likely to produce negative rather than positive results.

The fundamental key is to make sure you are incorporating consistent ACTIVE interaction with data, where you induce change across a wide variety of items and measure their causal value. Combined with, or leading, your correlative information, you can discover amazing new lessons that you would never have learned before. Without doing this, the data that many claim is leading them to conclusions is often incomplete or fundamentally wrong, and can in no way produce the insights that people are claiming. The core goal is always to minimize the cost of this active interaction with data while maximizing the number and range of alternatives that you are comparing. Failure to do this will inevitably lead to lost revenue, and often to false directions for entire product roadmaps, as people leverage data to confirm their opinions rather than using it rationally to produce amazing results.
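To make the email sign-up example concrete, here is a short Python sketch contrasting the correlative view with an active, causal measurement; all figures and the split-test setup are invented for illustration:

```python
import statistics

# correlative view: daily email sign-ups and conversions move together,
# but this alone cannot tell us which direction the influence runs
signups     = [120, 135, 150, 160, 180]
conversions = [310, 340, 345, 370, 400]

def pearson_r(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(f"correlation: {pearson_r(signups, conversions):.2f}")  # ~0.98, direction unknown

# causal view: actively push sign-ups for a random half of visitors
# and measure conversion directly against the untouched control
control_conv, control_n = 400, 10_000
variant_conv, variant_n = 388, 10_000
causal_lift = (variant_conv / variant_n) / (control_conv / control_n) - 1
print(f"causal lift: {causal_lift:+.1%}")  # -3.0%: more sign-ups, fewer conversions
```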

Examples – Multiple success metrics, Attribution, Tracking Clicks, Personas, Clustering

Solution – Causal changes can arm you with the added information needed to answer these questions more directly, but in reality that is not always going to be an option. If nothing else, always remember that for any data to tell you what led to something else, you have to prove three things (a sketch of the first check follows below):

1) That what you saw was not just a random outcome

2) That the two items are correlated with each other, and not each just responding to some other change

3) That the causal direction runs the way you claim, since without proving direction you cannot prove any conclusion

Just the act of stopping people from racing ahead or abusing this data to prove their own agenda will dramatically improve the efficiency of your data usage, as well as the value derived from your entire data organization.
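For the first of those proofs, a standard two-proportion z-test is one common way to ask whether an observed gap could just be a random outcome. A minimal sketch, with invented test figures:

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """How unlikely is the observed gap if both variants actually
    behave the same? (Check 1: not just a random outcome.)"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

z = two_proportion_z(conv_a=400, n_a=10_000, conv_b=460, n_b=10_000)
print(f"z = {z:.2f}")  # ~2.09; |z| > 1.96 makes pure chance unlikely at the 95% level
```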

Rate vs. Value –

Problem – There is nothing more common than finding patterns and anomalies in your analytics. This is probably the single core skill of all analysis, yet it can also be the most misused or abused action taken with data. It can be segments that have different purchase behavior, channels that behave differently, or even "problems" with certain pages or processes. Finding a pattern or anomaly is at best the halfway point to actionable insight, not a final stop to be followed blindly. Rate is the pattern of behavior, usually expressed as a ratio of actions. Finding rates of action is the single most common and core action in the world of analytics, but the issue usually comes when we confuse the pattern we observe with the action needed to "correct" it. Like correlation vs. causation above, a pattern by itself is just noise. It takes active interaction, and comparison with other, less easily identified options, to validate the value of those types of analysis.

Google users spending 4.34 minutes per visit, or email users averaging a visit depth of 3.4 pages, are examples of rates of action. What this is not is a measure of the value of those actions. Value is the change in outcome created by a certain action, not the rate at which people happened to do things in the past. Most people understand that "past performance does not ensure future outcomes," but they fail to apply the same logic when looking for patterns in their own data. Value is expressed as a lift or differentiation: things like adding a button increased conversion by 14%, or removing our hero image generated 18% more revenue per visitor.

The main issues come from confusing the ability to measure different actions with knowing how to change someone's behavior. The simplest example is the null hypothesis: what would happen if that item wasn't there? Just because 34% of people click on your hero image, by far the highest amount on your homepage, what would happen if that image wasn't there? You wouldn't just lose 34% of people; they would instead interact with other parts of the page. Would you make more or less revenue? Would it be better or worse?

It also comes down to two different business questions. At face value, the only question you can answer with pattern analysis alone is "What is an action we can take?", whereas in the ideal business case you would instead answer "Based on my current finite resources, what is the action I can take to generate the most X?", where X is your single success metric. Rates of action carry no measure of the ability to change behavior or of the cost to do so, and as such they cannot answer many of the business questions they are erroneously applied to.
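A small Python sketch of the rate-versus-value distinction, reusing the hero-image numbers above (the revenue-per-visitor figures are invented):

```python
# rate: what people happened to do (an observed pattern)
hero_clicks, homepage_visitors = 3_400, 10_000
click_rate = hero_clicks / homepage_visitors  # 34% click the hero image

# value: what removing the item actually changes, measured with a
# controlled comparison against the null of the image not being there
rpv_with_hero    = 4.50
rpv_without_hero = 5.31
value = rpv_without_hero / rpv_with_hero - 1

print(f"rate:  {click_rate:.0%} click the hero image")
print(f"value: {value:+.0%} revenue per visitor from removing it")
```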

Examples – Personalization, Funnel Analysis, Attribution, Page Analysis, Pathing, Channel Analysis

Solution – The real key is to make sure that, built into any optimization plan, you are incorporating active data acquisition and that you are always measuring null assumptions and the value of items. This information, combined with knowledge of influence and cost to change, can be vital; without it, a pattern is likely empty noise. There are entire branches of math dedicated to this, the most common being bandit-based problem solving. Once you have actively acquired knowledge, you will start to build information that can inform and improve the cost of data acquisition, but never replace it.

These are but two of the many areas where people consistently make mistakes when leveraging data and statistical concepts, reaching false conclusions. Data should be your greatest asset, not your greatest liability, but until you help your organization make data-driven decisions rather than data-validated decisions, there will always be massive opportunities for improvement. Make it a focus to improve your organization's understanding of and interaction with each of these concepts, and you will start using far fewer resources while producing far better outcomes. Failure to do so ensures the opposite over time.

Understanding data and data discipline has to become your biggest area of focus, and educating others your primary directive, if you truly want to see your organization take the next step. Don't let just reporting data or making claims of analysis be enough for you, and you will quickly find that it is not enough for others.

Rant – Testing is about the Driver, not the Car

I recently answered a question on the value of testing on Quora and was asked to re-post my response here by a few people I know in the industry.

Question: I’ve been all about A/B testing, but then I just read this post from Erik Severinghaus. Is A/B testing as valuable as we think?

Answer: Like many things in life, the answer is not that simple. Think of it like driving a car: there are good drivers, slow drivers, oblivious drivers, angry drivers, and skill ranges from low to professional. The issue is with the driver, not with the concept of a car.

Testing is much the same way. The reality is that in many organizations (including many that champion testing to death) there is very little value in how they leverage testing. There are many cases where testing is actually costing those companies money, because they are not disciplined in how they approach things: they focus on idea validation and do not understand how to act on data, doing things like blindly following statistical confidence. If you look at how he describes testing in that blog post, this is where those people are. If MVT is simply a way to throw a bunch of items against a wall and choose a winner, then you know that you are firmly in this realm. In those cases, I would argue that testing is worse than a mouse pad; it is more analogous to a cup holder in a car. It is there, people use it, they get enjoyment out of it, but it has nothing to do with where the car ends up or how fast it gets there.

There are other organizations that look at data differently and use testing in a different manner: one used to focus and leverage resources, not for validation or "choosing 2 headlines." In those situations, there is very little that can be said to fully describe just how valuable testing is. Testing changes the direction of entire organizations; it proves people wrong, it focuses resources, and it allows for the exploration of feasible alternatives, letting you really know the value of actions, not just argue them. It is a tool whose use is to find out which of many different routes is most valuable, and then help you drive down those roads, providing more and more value at each step. Those scenarios are more analogous to testing being the GPS, describing routes and shorter distances, as well as helping maximize time and fuel.

In both cases, there are ways to automate the process to lower decision time and increase the efficiency of the test itself. That doesn't address the real problem, however: if the entire vision of testing is wrong, then it doesn't matter what system you use to make decisions or how you leverage MVT. It really doesn't matter what size drink goes into the cup holder, or how many different drinks can be placed there over time. If you are going down the other route, then how fast your GPS updates, what information it uses, and what factors you use to decide routes can have a massive impact on where you end up.