There are many challenges for anyone entering a new field of study or a new discipline. We all come to any new concept with our previously held knowledge and beliefs filtering and changing how we view the new thing before us. Some choose to make it fit their world view, others dismiss it out of fear, and others look for how it can change their current world view. Usually in these situations I quote Sherlock Holmes: “It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” Nothing represents this challenge in online marketing more than the differences between analytics and optimization, and nothing represents that struggle more than the debate about visit based measurement versus visitor based measurement.
The debate about whether someone should use a visit, impression, or visitor basis for analysis is a perfect example of this problem, as it is not as simple as always using one or the other. When you are doing analytics, visits are usually the best way to look at data. When you are doing optimization, there is never a time when visits would present more relevant information than a visitor based view of the data.
Analytics = Visit
Optimization = Visitor
The only possible exceptions are when you are using adaptive learning tools. While the rules can be simple, a deep understanding of the why presents many other opportunities to improve your overall data usage and the value derived from every action.
Since most people reading this come from an analytics background, let’s look at what works best in that environment. Analytics is a single data set correlative metric system, which is a long way of saying it counts things on a consistent basis across one set of data, even if that data has many different dimensions. You are only recording what was, not what could or should be. In that environment, you have to look at data in some very particular ways. The first among those is a very tight control on accuracy, since in many cases that data is used to represent what the business did, and hopefully to make predictions about the future.
It is also important that you are consistent with how you measure and that you look at things on a common basis. Because most people are comfortable looking at a day or shorter-term basis, the easiest method is going to be a visit. It works great because you are trying to look at interactions and to measure a raw count of things that did happen, e.g. how many conversions, or how many people came from SEO. In those cases, a raw count in a correlative context is best represented on a visit basis, since it mitigates lost data (though the amount is not massive) and it best reflects the common basis on which people look at data.
In the world of optimization, however, you have a completely different usage and type of data. In optimization we are looking at a single comparative data point, and trying to represent an entirely different measure, which is influence on behavior over time. It doesn’t matter if your site changes once a year or once an hour, or if your buying cycle is 1 visit or 180 days; all of those things are irrelevant to the fact that you are influencing a population over time. Because behavior is defined as influence on a population, and because we are looking comparatively over time, the measurement techniques used in analytics need to be rethought. Any concern about accuracy, past a simple point, becomes far less important than precision (consistency of data collection), since any error introduced is going to be equally distributed. It doesn’t matter if the common basis is $4.50 or $487.62; what matters is the relative change based on the controlled factor. It is also important that we focus far more on the influence than the raw count, which means we are really talking about the behavior of the population.
In analytics you are thinking in terms of the count of the outcome (rate), whereas in optimization the focus is on the influence (value). To really understand optimization, you have to understand that all groups start with a standard propensity of action, which is represented by your control group. If you do nothing, the people coming to your site, people in all stages and all types of interaction, measure up to one standard measure across your site (though all measurement systems have a small degree of internal variance). Since we are measuring not what the propensity of action is but what our ability to positively or negatively influence it is, we need to think in terms of reporting based on visitors and based on the change (lift), not the raw count.
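To make the distinction concrete, here is a minimal Python sketch of visitor based reporting. All numbers are made up for illustration: the control group defines the baseline propensity of action, and what we report for the test experience is the lift against that baseline, not the raw count.

```python
# Hypothetical visitor-level totals for a control and a test experience.
control = {"visitors": 10_000, "orders": 400}
variant = {"visitors": 10_000, "orders": 452}

def conversion_rate(group):
    # Propensity of action: orders per unique visitor, not per visit.
    return group["orders"] / group["visitors"]

def lift(test, baseline):
    # Relative change (influence) against the control's baseline propensity.
    return conversion_rate(test) / conversion_rate(baseline) - 1

print(f"baseline propensity: {conversion_rate(control):.2%}")
print(f"lift: {lift(variant, control):+.1%}")
```

The raw count (452 orders) means nothing by itself; the 13% relative change against the control is the measure of influence.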
You also have the case of time, where we need to measure total impact over time. While it is correct that every time a visitor hits your site you have a chance to influence them, it is important to remember that the existing propensity of action measurement already accounts for this. What we are looking for is a simple measure of what we accomplished in terms of getting them to spend more. This means that we have to think in terms of both long and short-term behavior. Some people will purchase today, some 3 visits later, but all of that is part of standard business as usual. It is incredibly easy to have scenarios where you get more immediate actions but fewer long-term actions. This means that on a daily basis you might see a short-term spike, but for the business overall you are actually going to be making less revenue. This possibility creates two possible measurement scenarios:
1) There is no difference between short-term and long-term behavior, meaning the short-term spike continues through and is positive in the long term as well. In this scenario the only way to know that is to look at the long term.
2) Short and long-term behaviors differ, and we get a different outcome by looking at the visitor metric over time. In this scenario the only positive outcome for the business comes from the visitor based metric view.
In both cases the visitor based view gives us the full picture of what is good for the business, while the visit based view either adds no additional value or adds negative value by reaching a false conclusion. The visitor based view is not only the most complete view, no matter the situation, but the only one that can give you a rational view of the impact of a change. To top it off, the choice to look only at the shorter window creates a distribution bias, by valuing short-term behavior over long-term behavior, which may call into question the relevance of the data used to reach any conclusion.
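A quick Python sketch of scenario 2, with invented numbers, shows how a visit-window view and a visitor-window view of the same test can point in opposite directions:

```python
# Hypothetical per-group revenue: what was earned inside the short visit
# window vs. the visitor's full buying cycle (all numbers invented).
groups = {
    "control": {"n": 5_000, "visit_window": 10_000, "full_cycle": 50_000},
    "variant": {"n": 5_000, "visit_window": 12_000, "full_cycle": 46_000},
}

def lift(metric):
    c, v = groups["control"], groups["variant"]
    return (v[metric] / v["n"]) / (c[metric] / c["n"]) - 1

# The short window shows a spike; the visitor view shows the real impact.
print(f"visit-window lift: {lift('visit_window'):+.0%}")
print(f"full-cycle lift:   {lift('full_cycle'):+.0%}")
```

The same test reads as a +20% win through the visit window and as a revenue loss over the full cycle; only the visitor view tells you which decision is good for the business.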
The visitor vs. visit based view of the world is just one of many massive differences that reduce the value derived from optimization if it is not understood or not evaluated as a separate discipline. Because it is so easy to rationalize sticking with what is comfortable, it is common to find this massive weakness propagated throughout organizations with no measure of what the cost really is. While not as damaging as other mistakes, like not having a single success metric or not understanding variance, it is vital that you think about visit and visitor based data as attached to the end goal and not as a single answer to everything.
In the end, the debate about which version to use is not really one about visits or visitors; there are clear reasons to choose visits for analytics and visitors for optimization. The real challenge is whether you and your organization understand the different data disciplines being leveraged. If you constantly look for different ways to think about each action, you will find new and better ways to improve value; if you fail to do so, you will cause damage throughout your organization and not even know you are doing it.
You can’t go five minutes in the current business world without the terms big data, predictive, or statistical tool being thrown about. If one were to believe all of the hype, you would have no problems making perfect decisions, acting quickly, and everyone would be improving their performance by millions of dollars every hour. Of course everyone in the field also acknowledges just how far everyone else is from that reality, but they fail to see the same errors in logic in their own promises and their own analysis. All data is leveraged using mathematical tools, many of which are not understood at the level necessary to maximize their value. Data can be a powerful and important aid to improving business and a real deciding factor between success and failure. It can also be a crutch used to make poor decisions or to validate one opinion versus another. The fundamental truth is that nothing about “big data” is really all that new, and in almost all cases, the promises that people are making have no basis in reality. It is vital that people understand core principles of statistics so they can differentiate when data is being used in either of those two roles and help maximize the value that data can bring to their organization.
So how then do you arm yourself to maximize outcomes and to combat poor data discipline? The key is understanding key concepts of statistics, so that you can spot when and how promises are made that cannot possibly be true. You do not need to understand the equations, or even have master-level depth on most of these topics, but it is vital that you understand the truth behind certain types of statistical claims. I want to break down the top few that you will hear, how they are misused to make promises, and how you can really achieve that level of success.
Correlation does not Equal Causation –
Problem – I don’t think anyone can get through college without having heard this phrase, and most can quote it immediately, but very few really focus on what it means. The key thing to take from this is that no matter how great your correlative analysis is, it cannot tell you the cause of the outcome nor the value of items without direct, active interaction with the data. No matter how well you can prove a linear correlation, or even find a micro-conversion that you believe represents success, by itself it can never answer even the most basic real world business questions. Correlations can be guiding lights towards a new revelation, but they can also just be empty noise leading you away from vital information. It is impossible to tell which if you leave the analysis at basic correlation, yet in almost all cases this is where people are more than happy to leave their analysis. The key is to make sure that you do not jump to conclusions and that you incorporate other pieces of information instead of blindly following the data.
Just because I can prove a perfect correlation between email sign-ups and conversion rate, that they both go up, I can never know from correlation alone whether getting more people to sign up for emails CAUSED more conversions, or whether the people we got to convert more are also more interested in signing up for email. In a test this is vital, because not only is it easy to conflate those two explanations, you are also limited to a single data point, making even correlation impossible to diagnose. It is incredibly common for people to claim they know the direction and that they need to generate more email sign-ups in order to produce more revenue, but it is impossible to reach that conclusion from purely correlative information alone, and it can be massively damaging to a business to point resources in a direction that can just as easily produce negative results as positive ones.
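A small simulation makes the point. In this hypothetical model, a hidden “intent to buy” drives both email sign-ups and purchases, and by construction neither causes the other; yet the correlation looks exactly like “sign-ups drive conversions”:

```python
import random

random.seed(0)

def simulate_visitor():
    intent = random.random()                     # hidden confounder
    signed_up = random.random() < intent         # intent drives sign-ups...
    purchased = random.random() < intent * 0.5   # ...and purchases
    return signed_up, purchased

visitors = [simulate_visitor() for _ in range(100_000)]

def purchase_rate(group):
    return sum(bought for _, bought in group) / len(group)

signed = [v for v in visitors if v[0]]
unsigned = [v for v in visitors if not v[0]]

# Subscribers convert far better, yet forcing more sign-ups would change
# nothing in this model, because intent is doing all of the work.
print(f"purchase rate, signed up:     {purchase_rate(signed):.1%}")
print(f"purchase rate, did not sign:  {purchase_rate(unsigned):.1%}")
```

The correlative report here would scream “drive email sign-ups,” and it would be completely wrong; only inducing a change (actively getting more sign-ups and measuring the result) could tell you that.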
The fundamental key is to make sure that you incorporate consistent ACTIVE interaction with data, where you induce change across a wide variety of items and measure the causal value of each. Combined with, or leading, your correlative information, you can discover amazing new lessons that you would never have learned before. Without doing this, the data that many claim is leading them to conclusions is often incomplete or fundamentally wrong, and can in no way produce the insights that people are claiming. The core goal is always to minimize the cost of this active interaction with data while maximizing the number and quality of alternatives that you are comparing. Failure to do this will inevitably lead to lost revenue, and often to false directions for entire product road maps, as people leverage data to confirm their opinions instead of truly using data rationally to produce amazing results.
Examples – Multiple success metrics, Attribution, Tracking Clicks, Personas, Clustering
Solution – Causal changes can arm you with the added information needed to answer these questions more directly, but in reality that is not always going to be an option. If nothing else, always remember that for any data to tell you that one thing led to another, you have to prove three things:
1) That what you saw was not just a random outcome
2) That the two items are correlated with each other, and not just some other change
3) That the causal direction runs the way you claim it does, since correlation alone cannot establish it
Just the very act of stopping people from racing ahead or abusing this data to prove their own agenda will dramatically improve the efficiency of your data usage as well as the value derived from your entire data organization.
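For point 1, a simple permutation test answers “was this just a random outcome?” by checking how often shuffling the labels alone produces a gap as big as the one observed. The conversion counts below are made up for the sketch:

```python
import random

random.seed(1)

# Hypothetical outcomes: 1 = converted, 0 = did not.
control = [1] * 400 + [0] * 9_600
variant = [1] * 460 + [0] * 9_540

def rate(xs):
    return sum(xs) / len(xs)

observed = abs(rate(variant) - rate(control))

# Shuffle the pooled outcomes; if random relabeling frequently matches
# the observed gap, the result could easily be chance.
pooled = control + variant
extreme = 0
trials = 2_000
for _ in range(trials):
    random.shuffle(pooled)
    a, b = pooled[:len(control)], pooled[len(control):]
    if abs(rate(b) - rate(a)) >= observed:
        extreme += 1

print(f"p-value ≈ {extreme / trials:.3f}")
```

This only settles point 1; points 2 and 3 still require controlled, active changes rather than more observation.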
Rate vs. Value –
Problem – There is nothing more common than finding patterns and anomalies in your analytics. This is probably the single core skill of all analysis, yet it can often be the most misused or abused action taken with data. It can be segments that have different purchase behavior, channels that behave differently, or even “problems” with certain pages or processes. Finding a pattern or anomaly is at best the halfway point to actionable insight, not a final stop to be followed blindly. Rate is the pattern of behavior, usually expressed as a ratio of actions. Finding rates of action is the single most common and core activity in the world of analytics, but the issue usually comes when we confuse the pattern we observe with knowing how to “correct” it. Like correlation vs. causation above, a pattern by itself is just noise. It takes active interaction and comparison with other, less obvious options to validate the value of those types of analysis.
Statements like “Google users spend 4.34 minutes per visit” or “email users average a visit depth of 3.4 pages” are examples of rates of action. What they are not is a measure of the value of those actions. Value is the change in outcome created by a certain action, not the rate at which people happened to do things in the past. Most people understand that “past performance does not ensure future outcomes,” but they fail to apply the same logic when looking for patterns in their own data. Value is expressed as a lift or differentiation: things like adding a button increased conversion by 14%, or removing our hero image generated 18% more revenue per visitor.
The main issues come from confusing the ability to measure different actions with knowing how to change someone’s behavior. The simplest example of this is the simple null hypothesis: what would happen if that item wasn’t there? Just because 34% of people click on your hero image, which is by far the highest amount on your homepage, what would happen if that image wasn’t there? You wouldn’t just lose 34% of people; they would instead interact with other parts of the page. Would you make more or less revenue? Would it be better or worse?
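Here is a toy Python illustration of that null assumption, with every number invented for the example: the hero image gets the most clicks, but under the assumption that its clicks would redistribute to the remaining elements, removing it could actually raise revenue per visitor:

```python
# Share of visitors clicking each element, and revenue per click
# for each destination (all numbers are hypothetical).
click_share = {"hero": 0.34, "nav": 0.30, "search": 0.20, "other": 0.16}
revenue_per_click = {"hero": 0.50, "nav": 1.20, "search": 2.00, "other": 0.40}

def revenue_per_visitor(shares):
    return sum(shares[k] * revenue_per_click[k] for k in shares)

with_hero = revenue_per_visitor(click_share)

# Null assumption: without the hero, its clicks redistribute
# proportionally to the remaining elements instead of vanishing.
rest = {k: v for k, v in click_share.items() if k != "hero"}
scale = sum(rest.values())
without_hero = revenue_per_visitor({k: v / scale for k, v in rest.items()})

print(f"with hero:    ${with_hero:.2f} per visitor")
print(f"without hero: ${without_hero:.2f} per visitor")
```

The highest rate of action turns out to carry the lowest value in this toy model, which is exactly why only an actual test of removing the item can settle the question.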
It also comes down to two different business questions. At face value, the only question you can answer with pattern analysis alone is “What is an action we can take?”, while in the ideal business case you would instead answer “Based on my current finite resources, what is the action I can take to generate the most X?”, where X is your single success metric. Rates carry no measure of the ability to change behavior or of the cost to do so, and as such they cannot answer many of the business questions they are erroneously applied to.
Examples – Personalization, Funnel Analysis, Attribution, Page Analysis, Pathing, Channel Analysis
Solution – The real key is to make sure that, built into any plans of optimization, you are incorporating active data acquisition and that you are always measuring null assumptions and the value of items. This information, combined with knowledge of influence and cost to change, can be vital; without it, it is likely empty noise. There are entire fields of mathematics dedicated to this, the most common being bandit based problem solving. Once you have actively acquired knowledge, you can start to build information that informs and improves the cost of data acquisition, but never replaces it.
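As a sketch of the bandit idea, here is a minimal epsilon-greedy strategy with made-up conversion rates (this is one simple bandit approach among many, not a prescription): a fixed slice of traffic keeps actively acquiring data, while the rest exploits the current best-known experience.

```python
import random

random.seed(2)

# True conversion rates are hidden from the algorithm (invented numbers).
true_rates = {"A": 0.040, "B": 0.050, "C": 0.035}
shown = {arm: 0 for arm in true_rates}
converted = {arm: 0 for arm in true_rates}

def choose(epsilon=0.1):
    # Explore a fixed share of the time (active data acquisition)...
    if random.random() < epsilon or not any(shown.values()):
        return random.choice(list(true_rates))
    # ...otherwise exploit the empirically best arm so far.
    return max(shown, key=lambda a: converted[a] / shown[a] if shown[a] else 0.0)

for _ in range(50_000):
    arm = choose()
    shown[arm] += 1
    converted[arm] += random.random() < true_rates[arm]

print(shown)  # traffic tends to concentrate on the best performer
```

Note how the exploration slice never drops to zero: the acquisition of new data is reduced in cost, but never replaced.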
These are but two of the many areas where people consistently misuse data and concepts from statistics to reach false conclusions. Data should be your greatest asset, not your greatest liability, but until you help your organization make data driven decisions, and not data validated decisions, there are always going to be massive opportunities for improvement. Make it a focus to improve your organization’s understanding of and interaction with each of these concepts and you will start using far fewer resources and achieving far better outcomes. Failure to do so ensures the opposite outcomes over time.
Understanding data and data discipline has to become your biggest area of focus, and educating others your primary directive, if you truly want to see your organization take the next step. Don’t let just reporting data or making claims of analysis be enough for you, and you will quickly find that it is not enough for others.
As personalization and testing continue to become more and more mainstream, you are starting to see a whole slew of groups being introduced to testing, or who may believe they have more functional optimization knowledge than they really do. So many groups would be better off if they just avoided a number of common pitfalls that befall new programs. While earlier I put together my list of the deadly sins of testing (no single success metric, weak leadership, failure to focus on efficiency, testing only what you want, confusing rate and value, and falling for the graveyard of knowledge), I want to instead give you a very quick checklist to make sure that at least your first few efforts in testing are far less painful and far more fruitful than what normally happens.
It does not matter if you have tested before or are new to testing. What matters is how you tackle today’s issues and how you set yourself up to succeed. Breaking down the components of a successful first test allows you to see the action items, and allows you to move towards the moments that really make or break a program. Nothing makes everyone’s life harder than starting out testing on the wrong foot, since everyone will think that is how things are going to happen from then on. With that in mind, if you do nothing but follow this simple checklist, you are going to be far better off.
1) Decide on what you are trying to accomplish – Pick one metric for all your tests, across your entire site or org that defines success. This might be easy or hard, but it is vital. This is especially difficult for people coming from an analytics background who are used to reporting on multiple things and finding stories.
2) Pick one or two pages to start – Do not try to eat the entire elephant in one sitting.
3) Do not focus on your test ideas – You are going to have test ideas in mind, and you are going to want to talk only about, and focus resources on, that one test. Without fail this is where most groups want to focus, but I cannot stress enough how unimportant your concept for improvement will be to a successful program.
4) Make sure your first test is very simple – Don’t start trying to do a full redesign of your entire site or completely change a user flow. Pick one page and one thing to start. If you are not sure, then pick a page and design your first test to measure the relative value of all the items on the page.
5) Decide on your initial segments – Make sure all tests have a segment list. Not focusing on a specific segment will make it far easier to find exploitable segments, and will start the process of learning even if you do not intend to use them right away.
Here are some basic rules for segments to make your life easier:
• Must be at least 5% of your total population (10% for a smaller site). This is total, not identified, traffic
• Must have a comparable measure (you can’t compare new users versus Google users, since the groups overlap: there are new Google users).
• Have to be defined PRIOR to the start of any campaign
• Need to cover the 4 big user information types (the listed items are just examples):
• Day of week
• Operating system (basically all the stuff you get from the user agent string)
• How the user got to the site
• Keyword types
• Whether they used internal search before
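The ground rules above are easy to encode. Here is a hypothetical Python helper (the names and thresholds are my own for illustration, not from any tool) that flags segments that break the size and timing rules:

```python
MIN_SHARE = 0.05  # 5% of TOTAL population (use 0.10 for smaller sites)

def segment_problems(segment_visitors, total_visitors, defined_before_launch):
    """Return a list of rule violations; an empty list means usable."""
    problems = []
    if segment_visitors / total_visitors < MIN_SHARE:
        problems.append("under 5% of total population")
    if not defined_before_launch:
        problems.append("not defined prior to campaign start")
    return problems

# A keyword segment that is large enough and was defined up front:
print(segment_problems(8_000, 100_000, defined_before_launch=True))
# A tiny, after-the-fact segment that breaks both rules:
print(segment_problems(2_000, 100_000, defined_before_launch=False))
```

The comparability rule is harder to automate, since it requires knowing whether two segments overlap, but checks like these make the other rules part of launch hygiene rather than a debate after the fact.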
6) Start building a proper infrastructure – Build out a technical framework for testing across your entire site. This may not be tied to the first test, though that will be part of it. Get access so that you can run tests on 80% of your pages, using a combination of local and global set-up. A little pain up front will save you from lots of pain later on. It is always best to avoid death by a thousand cuts whenever possible, even if you don’t see that issue immediately.
7) Decide on rules of action – Make sure everyone is very clear on how you are going to call a test and act on winners before you launch your first test.
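Those rules of action can literally be written down before launch. Below is a sketch of one possible pre-agreed decision function; the minimum sample size and confidence threshold are illustrative choices, not universal rules, and the statistic is a standard two-proportion z-test:

```python
import math

def decide(control_conv, control_n, variant_conv, variant_n,
           min_n=5_000, z_needed=1.96):
    # Rule 1: do not call anything before the agreed minimum sample.
    if min(control_n, variant_n) < min_n:
        return "keep running: below minimum sample size"
    p1, p2 = control_conv / control_n, variant_conv / variant_n
    pooled = (control_conv + variant_conv) / (control_n + variant_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    z = (p2 - p1) / se
    # Rules 2-4: the agreed action for each possible outcome.
    if z > z_needed:
        return "push the winner"
    if z < -z_needed:
        return "roll back and document the lesson"
    return "flat: keep the cheaper experience"

print(decide(400, 10_000, 470, 10_000))
print(decide(40, 1_000, 47, 1_000))
```

The exact function matters less than the fact that everyone agreed to it, in writing, before the test went live.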
8) Make sure you are not going to QA tests like you do other parts of your site – So many testing programs are destroyed by letting IT decide on the work flow and on how you are going to QA your tests. You may have to work with your IT group, but it is vital that testing is not owned by IT but instead by your marketing and product teams.
9) Point your resources towards feasible alternatives, not adding more metrics – Any additional time and additional resources need to be used on the following two things:
A) Adding as many alternatives to the test as possible, traffic permitting
B) Educating and working with groups to make sure they understand why you are testing and how you are going to act.
10) Remember that the most important time for your program is not the first test – Your first test is fun to focus on, but the period after your first test is where you can start to really focus on testing discipline. This is the time that defines whether you get good at just running tests, or if you build a rock star optimization program. So many groups fail because they miss how vital this time is.
There you go: 10 simple steps to make sure that your first moments in testing do not lead your program astray. This is hardly the end of the road, but if you simply avoid setting yourself up for failure, then you can really start to look ahead to all the opportunity that is out there.
There is a lot of misunderstanding about the difficulty of using Adobe Target. There are all sorts of uses of the tool, and in a lot of cases, people confuse their inability to set up a proper infrastructure with limitations of the tool. The wonderful thing about Target is that it has 5,000 different uses. The difficult thing about Target is that it has 5,000 different uses. Figuring out what to do, when to use it, and how to tackle each problem is paramount to a mature program.
Before we get to specific suggestions, however, I want to make it clear what I consider a proper technical infrastructure. It comes down to how you answer this question: can you get 80% of all tests live, with the proper number of variations, in 30 minutes or less? That is from concept to launch. That number seems crazy to people at first, but the reality is that you can get a test fully live in 5 minutes if you have everything ready to go. That is the sign of a successful infrastructure: speed of execution, understanding, and flexibility. In order to do that, you have to have things set up in a way that will make you successful. Here are the key things to do and avoid in that quest for speed and execution without sacrificing (in fact, while improving) results.
Do – Make sure that you can test on all major parts of your site today
Make sure that you have mboxes on all key parts of your site. You will most likely want multiple mboxes on those key pages, usually leveraging a mixture of global and wrapped mboxes throughout the page. Global mboxes are ones that do not wrap any content, but exist at the very top of the page to control CSS, redirects, and other functions. Wrapped mboxes are ones that wrap key elements of the page, with the best practice of never wrapping more than one-third of the viewable page with a single mbox. The key here is to make sure you are not adding and removing mboxes for each test. One of the nice things is that even if you do have those mboxes on the page, you can disable them easily in the console so that you are not charged for them.
Don’t – Have more than 5-6 mboxes on any page
Just because you want to prepare a page to succeed does not mean going crazy with mboxes. The reality is that each request does impact site performance, and any noticeable impact on the page will make the validity of a test result highly questionable. Setting up a page to be able to do 80% of the things you could do is important, as is prioritizing those efforts to leverage what you can do today to learn about the other 20% that require additional resources. If you can do 80% of possible things today, then do those things, and you will find that often the tests you are not sure about are the ones that produce the biggest and most meaningful winners.
Do – Make sure that you are tracking key information about your users
This is both an on-page and in-console element, and may require a bit more time and discipline to think about before you act on anything. It means leveraging one of the most under-utilized but extremely valuable parts of the tool: the profile system. Make sure that you are passing key information you might have from your CMS or other systems about your users where necessary, and also make sure that you are recording key pieces of information in the profile system. The key here is to look into it, but not believe it blindly; to search for value, not predetermine outcomes. Just because you really want to target a certain membership level does not mean you should ever target them before you test and measure the value of that change compared to other alternatives. Even if you aren’t leveraging that information today, it can be leveraged in the future, and it allows you to easily add it for analysis later.
Don’t – Delay your infrastructure for unnecessary pieces of information
Most groups tackle optimization as just the end function of a number of different inputs. The worst examples of this are when groups go around interviewing for a list of all possible pieces of data anyone might want to look at or segment by. While it might make your life easier, since you don’t have to correct anyone, the reality is that you are most likely causing massive damage to your program. It is a good idea to get a read on what people may want, but the primary directive should be to help groups think and act differently with data. As part of that, you need to ask the fundamental question: “Will the data I am collecting improve our business enough to be worth it, and even so, can I get better outcomes from using those same resources?” There will be cases when the answer is yes, but in most cases, if you are honest, the answer will be a resounding no. In those cases, do not delay other actions just to make others happy. Get the most with what you have; don’t waste time waiting for everything to be perfect.
Do – Make sure that you can get tests live without IT involvement.
You should be able to do 80% of tests with some very basic understanding of how a webpage works. After you get your infrastructure in place, you should not need IT support except in extreme circumstances, and only then once you have proven via discovery that the item you are spending those resources on is the most efficient use of everyone’s time. Where groups run into major problems is when they view each test as a technical project, which both gives the wrong impression about the difficulty of testing and makes things much slower to react than they should be. Even worse are groups that think technically complex means valuable, which is almost always the opposite of the truth. Difficult means difficult, valuable means valuable; the two do not have much to do with each other.
Prioritize tests by how much effort they are to get live, as well as how many different feasible alternatives you will get, and how much they challenge assumptions. Do that, and you will dramatically improve the outcomes and reduce the resources needed for your program.
Don’t – QA tests the way you do everything else on your site
There is nothing worse for testing than having a test ready to go and then having to wait weeks for it to go through a standard QA process. Make sure you do a good job of QA, but keep in mind that nothing will ever be perfect, and that you can get most QA done for most tests by simply passing around a few QA links to a couple of colleagues and having everyone use a couple of browsers. You will most likely want to do more QA for efforts that dramatically change site function, but the reality is that you should be running fewer of those tests and far more of the basic tests anyway. Adding better rights control, so that fewer people can push things fully live, also adds meaningful limits to make sure you are accomplishing what is needed without sacrificing speed and efficiency.
The promise of the tool is that marketers start having control of their site, so why then do people so willingly give up that control? You need to be able to understand why you need to QA differently, and how different types of tests impact site function, so that you can have a meaningful conversation with your tech team and help them understand that you are not going to be constantly breaking their site. Be thorough, but don’t go crazy or just dump the responsibility onto others.
Do – Make sure that everyone is aware of when you are running a test
One of the benefits of setting up your campaigns to use QA parameters is that those links can be shared with everyone, and you can therefore make more people aware of a campaign. There is nothing worse than having a campaign live for a few days and then having to stop it because some VP randomly hit a test variant and thinks the site is broken. Communicating launches, results, and especially lessons learned across tests is also part of a successful infrastructure.
This will also help build awareness and interest in the results of the tests. If you design tests with enough variants and challenge enough assumptions, in almost all cases you will get results that make other groups fundamentally challenge their own ideas of what works.
Don’t – Ever launch a test without a clear idea of how you are going to act on the data
There is no point in any test if you are not clear on how you are going to act. This means knowing your single success metric, but also what it takes to push a winner, how you are going to follow up, and what you will do with segment information. Having those conversations outside the specifics of a test, and having groups aware of their roles before launch, is vital to a speedy and successful infrastructure.
Being prepared to act, and having your groups ready to act quickly and efficiently, can many times run against the entire history of some organizations. This is why it becomes so vital to build a proper infrastructure on every level to enable testing to work. Focusing on what matters, instead of just how to get a single test live, is one of the key differences between groups who just test and those that run successful testing programs. Make sure that you focus on the things outside of a specific test and help move the various groups involved towards an end goal, and you will be amazed at how fast you will act and the results you will achieve.