There are many unique parts to optimizing on a lower traffic site, but by far the most annoying is an expected high level of variance. As part of my new foray into the world of lead generation I am conducting a variance study on one of our most popular landing pages.
For those who are not clear on what a variance study is, it is when you run multiple identical copies of the same control and measure all of them against each other. In this case I have 5 versions of control, which gives a total of 20 data points (each of the 5 compared to the other 4). The point of these studies is to evaluate the normal expected variance range, as well as the minimum and maximum outcomes within that range. It is also designed to measure this over time so that you can see when and where it normalizes, as each site and page will have its own normalization curve and normal level of variance. For a large retail site with thousands of conversions a day you can expect around 2% variance after 7-10 days. For a lead generation site with a limited product catalog and much lower numbers, you can expect higher. You will always have more variance in a visit based metric system than in a visitor based metric system, as you are adding the complexity of multiple interactions being treated distinctly instead of in aggregate.
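To make the bookkeeping concrete, here is a minimal sketch of how those 20 data points could be tallied. The function names and the conversion counts in the usage note are hypothetical and only illustrate the mechanics; they are not the numbers from my study.

```python
from itertools import permutations


def pairwise_variance(arms):
    """Relative difference in conversion rate for every ordered pair of arms.

    `arms` maps an experience name to a (conversions, visitors) tuple.
    With 5 identical controls this yields 5 * 4 = 20 data points.
    """
    rates = {name: conv / visitors for name, (conv, visitors) in arms.items()}
    return {
        (a, b): (rates[a] - rates[b]) / rates[b]
        for a, b in permutations(rates, 2)
    }


def summarize(diffs):
    """Average, minimum and maximum absolute variance across all pairs."""
    magnitudes = [abs(d) for d in diffs.values()]
    return {
        "count": len(magnitudes),
        "mean": sum(magnitudes) / len(magnitudes),
        "min": min(magnitudes),
        "max": max(magnitudes),
    }
```

Calling `summarize(pairwise_variance({"A": (100, 1000), "B": (110, 1000), "C": (95, 1000), "D": (105, 1000), "E": (102, 1000)}))` on those made-up counts returns the count of comparisons along with the average, minimum and maximum variance. Running the same summary once per day is one way to watch the normalization curve settle.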
There are many important outcomes to these studies. It helps you design your rules of action including needed differentiation and needed amounts of data. It helps you understand what the best measure of confidence is for your site and how actionable it is. It also helps you understand normalization curves, especially in visitor based metric systems as you can start to understand if your performance is going to normalize in 3 days or 7. Assume you will need a minimum of 6-7 days past that period for the average test to end.
The most annoying thing is understanding all the complexities of confidence and how variance can really mess it up. There are many different ways to measure confidence, from frequentist to Bayesian and P-Score to Chi Square. The most common ways are Z-test or T-Test calculations. While there are many different calculations, they are all generally supposed to tell you very similar things. The most important of these is how likely it is that the change you are making is causing the lift you see. Higher confidence means that you are more likely to get the desired result. This means that in a perfect world a variance study should have 0% confidence, and you are hoping for very low marks. The real world is rarely so kind though, and knowing just how far off you are from that ideal is extremely important to knowing how and when to act on data.
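For reference, here is a rough sketch of one common Z-test formulation, the pooled two-proportion test, reported as a two-sided confidence. This is an assumption about the general shape of the calculation, not the exact math any particular tool runs, and the function name is my own.

```python
from math import erf, sqrt


def z_test_confidence(conv_a, n_a, conv_b, n_b):
    """Two-sided confidence (1 - p-value) that two conversion rates differ.

    Uses the pooled two-proportion z-test. Returns a value between 0 and 1.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # No conversions at all (or all conversions): nothing to compare.
        return 0.0
    z = (rate_b - rate_a) / std_err
    # For a standard normal, 1 - two-sided p-value equals erf(|z| / sqrt(2)).
    return erf(abs(z) / sqrt(2))
```

Two arms with exactly the same conversion rate give a z of 0 and a confidence of 0, which is the perfect-world result described above. Any gap between two identical experiences pushes the number up, even though nothing was changed.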
This is what I get from my 5 experience variance study:
To clarify, this is using a normal Z-Test P-Score approach, and each experience has more than the bare minimum number of conversions that most people recommend (100 per experience). This is being done through Google Experiments. The highest variance I have ever dealt with on a consistent basis is 5%, and anything over 3% is pretty rare. Getting an average variance of 11.83% after 5 days is just insane:
This is just not acceptable. I should not be able to get 97% confidence from forced noise. It makes any normal form of confidence almost completely meaningless. To make it worse, if I had not done this type of study, or if I did not understand variance and confidence, I could easily make a false positive claim from a change. These types of errors (both type 1 and type 2) are especially dangerous because they allow people to claim an impact when there is not one and to justify their opinions through purely random noise.
If you do not know your variance or have never done a variance study, I strongly recommend that you do one. They are vital to really making functional changes to your site and will allow you to avoid wasting so many resources and so much time on false leads.
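If you want a feel for how often identical experiences can cross a confidence threshold by chance alone, a small simulation helps. The sketch below is illustrative only: the true conversion rate, traffic levels, threshold and function names are all assumptions I picked for the example, not figures from my study, and it uses the same pooled Z-test formulation as one plausible stand-in for what a tool reports.

```python
import random
from itertools import combinations
from math import erf, sqrt


def confidence(conv_a, n_a, conv_b, n_b):
    """Two-sided confidence from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0
    z = (conv_b / n_b - conv_a / n_a) / std_err
    return erf(abs(z) / sqrt(2))


def simulate_variance_study(true_rate=0.05, visitors=2000, arms=5,
                            trials=200, threshold=0.95, seed=0):
    """Simulate repeated variance studies where every arm is identical.

    Returns two fractions of trials: how often one fixed pair of arms
    reached the threshold, and how often at least one pair out of all
    the arms did.
    """
    rng = random.Random(seed)
    single_pair_hits = 0
    any_pair_hits = 0
    for _ in range(trials):
        # Every arm draws visitors from the same true conversion rate.
        convs = [
            sum(rng.random() < true_rate for _ in range(visitors))
            for _ in range(arms)
        ]
        confs = {
            (i, j): confidence(convs[i], visitors, convs[j], visitors)
            for i, j in combinations(range(arms), 2)
        }
        if confs[(0, 1)] >= threshold:
            single_pair_hits += 1
        if max(confs.values()) >= threshold:
            any_pair_hits += 1
    return single_pair_hits / trials, any_pair_hits / trials
```

Calling `simulate_variance_study()` returns the two fractions side by side. The second can never be lower than the first, since the more comparisons you look at, the more chances pure noise has to look like a winner.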
So much has changed in my world recently and I wanted to give everyone a heads-up. After 5+ years trying to fix some of the largest and most complicated organizational optimization issues I have stepped away from Adobe and have decided to go in a somewhat new direction. I have taken a position as Director of Optimization for a small company in the Carlsbad, CA area called Questpoint where I will be overseeing optimization of a number of lead gen situations.
What this means is that I now deal with much smaller but much more meaningful measures of success. It also means that I can now talk much more directly about the challenges I face and the solutions as they present themselves to me. I will continue to investigate the theoretical challenges of optimization but will also be more directly talking about the realities of testing on a budget. I will be using a number of tools including Google Analytics and Google Experiments and will be breaking down the advantages and disadvantages of them in comparison to the enterprise level tools that I was familiar with.
Here is to the new path before me and here is to the many barriers and hills one must climb to bring that boulder to the top of the mountain.
I was recently lucky enough to get to take 6 weeks off for a sabbatical, where I was able to really get away from the life of a consultant and just spend time with my family. Upon returning I reached out to a number of people I work with or know in the industry to catch up. Two different people, both of whom I respect deeply and consider some of the best and brightest in the industry, regaled me with stories about how fed up they were with the industry and how they were losing faith in the system. This conversation is hardly new, but the stark difference between the “real” world and what happens in the corporate world was striking.
Without fail everyone comes to realize the limitations of the corporate world, and while every organization is different, the things you see, especially at the largest corporations, are almost all universal. One of the hardest things I have to deal with as I try and mentor new people or people who I want to help in the industry is help them really come to grips with this reality and help them see that there is hope, but that they will always be making choices: what is good for the company vs. what is good for them.
With that in mind, I wanted to help express some universal truths that I think everyone should be comfortable with if they want to really exist for any period of time in this business world.
Most effort is wasted – This becomes strikingly clear when you start doing exploratory causal analysis and look at the impact of work or entire departments. The number of times in the last 5 years that I have taken a few minutes of effort and the end result has shown that entire years’ worth of effort had negative impacts to the bottom line cannot be counted on my appendages. There are entire disciplines that people have devoted their entire lives to that have no impact whatsoever and are nothing more than phrenology.
But 100% of people think they do excellent work – This really hit home today with one of the conversations as the realization that action is confused with value really came home. Most people assume their actions are providing value, and because of the preponderance of data out there, most can find ways to come up with some story to justify their actions.
I have had multiple engagements that started with the person showing reports, graphs, and presentations claiming massive value from the program, only for a few minutes of diving into the numbers to show that not only was it not improving the business, in multiple cases it was actually causing catastrophic harm to the business. It happens everywhere; at least 70% of case studies are full of 100% fake data. People are so desperate to please their boss or make themselves look good that they find ways, often subconscious, to show their value. It is not malicious, it is just sociopathic.
My favorite story of this was when I was working with a group that was reporting how great their recommendation tool was and how it was generating 18% more revenue! In reality they were only looking at the revenue increase of the products recommended (3 out of a library in the hundreds). When doing just a standard analysis of the entire revenue stream, there were piles of data that showed they were losing 6% net revenue for the entire company, totaling millions and millions of dollars.
People are not rewarded if a company makes 3% more money from their actions, and as such they treat the lack of complaining as the ultimate sign of success. We never look at what could have been, only what was and how people reacted to it. Context is something you have to strive for and work hard to get, every day, otherwise almost all stories and data are meaningless. This often leads to many long term problems…
Because of this most people have no clue what they are talking about – If you can come up with a story to justify any action, you don’t dive to see if you are right, most effort is wasted, and the only thing people look at are the stories you weave, why would you need to know what you are talking about? It is far easier to create a narrative out of the air than it is to actually be able to back up anything you are talking about. Thanks to fear, Dunning-Kruger, and just common greed, this is allowed to take place. The more someone is able to convince people of their story (which is a different skill than actual results), the more they move forward, the more people believe it, and the more people want to copy it. It is a self-fulfilling cycle and one where actual knowledge is scorned because it serves as a direct challenge to the empires built by these people. If you really want to improve things, you must always make people go where they don’t want to, because the safe shores are the ones without accountability and that sound like the same things they have always been doing.
People who build the tools and work at agencies often know even less – I have direct experience with a large number of tools, and I am lucky enough to know “thought leaders” at a large number of other tools and agencies, and I can tell you as a whole most of the people you hear talking couldn’t provide value to you if their lives depended on it. They have become experts at telling you what you want to hear, not telling you what you need to hear. The top people in the industry are storytellers who weave a tale of telling you to do basically what you have been doing, but justify it with fancy terms or new actions to get to the same place. Tools become designed for this, people get advancements for this, and oftentimes anyone who doesn’t want to take part in this vicious cycle moves on to other endeavors, meaning the worst become the ones there longest, gaining power and only making the cycle worse. Of the top 5 agencies I know in my space, I know multiple top people at all of them, and not a one knows anything about providing value, or even cares. They do the same tired actions because that is what they have always done, and they don’t get called out on it, so why should they change? To quote one famous person in our industry, “I throw a grenade and try to get people to come to me when they run from it.”
And you thank them for it – You know why the tools are designed to do things that aren’t valuable and the top agencies are run by people who tell stories and have no clue what they are doing?
Because you do not hold them accountable
I have worked with exactly 3 organizations in 12 years where results mattered; the rest just want to sell a story internally, do something new, and then do more actions that make their boss happy. You buy the story, do the failed actions, and sell that story internally, which results in promotion, which propagates the cycle. The cycle spreads and, just as stated before, knowledge of other ways is simply a risk. This is why you find people in these places who are so good technically, but very few if any in most organizations who have any clue about strategy other than repeating the same tired failing things that everyone else repeats. Organizations want people to do what they say and tell them it is golden, not to make them money. The only person who is going to really hold you accountable for value derived is yourself.
But all hope is not lost – This environment is where we exist, and it has been that way since you started and will be far after you are done working. The environment doesn’t change, so it is up to you to decide how to deal with it. Just because people don’t want to change doesn’t mean they won’t, it just means it isn’t easy. Just because others don’t hold you accountable does not mean that you can’t. Just because doing what others want will help you move forward, it doesn’t mean that you have to sell out at all opportunities. It is a balancing act, both of survival and how to tackle these complex problems. The thing that makes people survive or be good is that they don’t hide from the reality, they embrace it, they might get frustrated, but they come back and push back even harder tomorrow. If you give in, if you become cynical, if you just give up and take the easy path, that is your decision, just as it is to do the right thing even if it is not best for your career. No one can tell you what the right choice is, all they can do is help you see that you are making these choices and help you make the one that is best for you.
And when you do overcome these problems it can be the ultimate high – Just because the entire system might be designed to keep things from progressing does not mean that progress isn’t made, only that it is rare and incredibly hard fought. I had the pleasure to also be on a call today with a client who has come from an org with no background in testing, who just threw up tests because they thought they should and who had no resources and no knowledge, who in the last 9 months has transformed to the point that they have a separate team, great discipline, good educational base, and who is running a series of exploratory tests. That moment, which I wish happened all the time but doesn’t, makes it all worth it, at least for me.
In the end, it isn’t about what title you have, who thinks you did well, it is about what are you trying to accomplish and did you hold yourself accountable to it? People can do good work, almost all of the problems I outlined above happen subconsciously, not consciously. People aren’t out to screw each other; they just do it and then rationalize it away. Opening up someone’s eyes, or making it so they don’t have to do the easy thing just because is all you can do.
Choose what you want, and then do it. Don’t let the system dictate the outcome, it is up to you to overcome, adapt, or become a cog in that machine.
I recently answered a question on the value of testing on Quora and was asked to re-post my response here by a few people I know in the industry.
Question: I’ve been all about A/B testing, but then I just read this post from Erik Severinghaus. Is A/B testing as valuable as we think?
Answer: Like many things in life, the answer is not that simple. Think of it like driving a car, there are good drivers, slow drivers, oblivious drivers, angry drivers, and skill ranges from low to professional. The issue is in the driver, not in the concept of a car.
Testing is much the same way. The reality is that in many organizations (including many that champion testing to death) there is very little value in how they are leveraging testing. There are many cases where testing is actually costing those companies money, because they are not disciplined in how they approach things, they focus on idea validation and do not understand how to act on data, doing things like blindly following statistical confidence. If you look at how he describes testing in that blog post, then this is where those people are at. If MVT is simply a way to throw a bunch of items against a wall and choose a winner, then you know that you are firmly in this realm. In those cases, I would argue that testing is worse than a mouse pad; it is more analogous to a cup holder in a car. It is there, people use it, they get enjoyment out of it, but it has nothing to do with where the car ends up or how fast it gets there.
There are other organizations which look at data differently and who use testing in a different manner, one used to focus and leverage resources and one that is not used for validation or “choosing 2 headlines”. In those situations, there is very little that can be said to describe just how valuable testing is. Testing changes the direction of entire organizations, it proves people wrong, it focuses resources, and it allows for the exploration of alternative feasible options and allows you to really know the value of actions, not just argue them. It is a tool whose use is to find out which of many different routes is the most valuable, and then help you drive down those roads, providing more and more value at each step. Those scenarios are more analogous to testing being the GPS, describing routes and shorter distances, as well as helping maximize time and fuel.
In both cases, there are ways to automate the process to lower decision time and to increase the efficiency of the test itself. That doesn’t address the real problem however which is if the entire vision of testing is wrong, then it doesn’t matter what system you use to make decisions or how you leverage MVT. It really doesn’t matter what sized drink goes into the cup holder, or how many different drinks can be placed there over time. If you are going down the other thought route, then how fast your GPS updates, what information it uses, and what factors you use to decide routes can have a massive impact on where you end up.