February 8, 2012

Testing 303 – Advanced Optimization Paradigms – Part 2

In the first part of our look at advanced paradigms, I focused on the complex interplay of testing and other parts of your organization. As testing grows, it starts to interact on a nearly daily basis with every part of your organization. If you look at the evolution that we have taken, going from the very fundamental building blocks of a testing program, to the ways we look at tests and testing, and finally to the complex interactions of testing into everything, we have shifted the importance and the value that testing brings. The final stage of evolution is to start evaluating your own core beliefs of even what is a testing program, data, and even how we view the world. It is easy to challenge others to grow, but the most difficult and most rewarding changes always start from within. If the evolution starts with getting people to align, it ends with changing our fundamental beliefs about data. We have to ask extremely difficult questions and challenge our own interactions, breaking down our beliefs and rebuilding them to strengthen and evolve.

To that end, here is the final look at advanced optimization paradigms:

No More Focusing on Test Ideas –

If we view optimization as a discipline, one that never starts and never ends, one that is about the constant changing and learning of a user experience, then there is no longer any need for individual test ideas. An idea naturally has a start and an end, as any hypothesis comes from a belief in a specific solution to an existing problem. People get so caught up in their idea, be it from their own experience, some piece of data that they just know means they have the solution to all your problems, or just “best practices”, that their brains shut down and they stop trying to find the best answer. The problem here is that the entire process leads to a myopia that gives us the right to stop, the ability to prove ourselves “right”, and a natural affinity towards a set path.

If the focus is no longer on test ideas, however, then the system is what you focus on. If we are instead treating things as a cycle (explore, learn, ideate on all feasible alternatives, execute, learn, repeat), then there really is no individual campaign or test. The path is never about what you think will win, or what you want to tackle, but instead only about where the causal data leads you and the evaluation of all feasible alternatives. Test ideas become the least important thing you can discuss, and should be viewed with high levels of skepticism. There is no such thing as a good test idea, only a concept that can be broken apart, challenged, and improved. Fear any expert trying to tell you a single great test idea, or any guaranteed set of steps to improve your site, as they are only playing to your own insecurities; the reality is that no single idea can hold up to scrutiny. This applies to your own beliefs as much as anyone else's. You have to hold yourself to the same level of scrutiny, not allowing what you want to happen to be the path that you go down or the answers you seek. You have to be willing to step outside your own opinion and focus on all feasible alternatives and only on what the efficiency of changes is; any bias that you allow to limit what you test, be it because of experience or popular opinion, devalues the outcome that you can generate.

The biggest challenge most people have is the feeling of a loss of control. We often like to blame other groups for this behavior, but by far the most guilty group is analysts, who are so busy trying to prove a point with their data that they fail to see the larger picture. They so want to prove a path using their analytics that they fail to factor in the need to change to an active form of data acquisition in order to move forward. You have to worry about your own biases before you can stop others. It is easy for all groups to get focused on what their experience or gut tells them is right, often to poor and inefficient outcomes. Make it clear that no idea stands alone. Put in place measures to ensure that you are not limited to popular opinion or only what you think or want to win. This often means that you have to prioritize resources in ways that you are not doing today, but ultimately this is the only way to ensure you are getting the greatest value and to ensure your own continued education about what the value of actions is.

Free yourself from the cycle of defending and pushing every idea, instead creating momentum and a consistent pattern of action. Everyone is afraid of moving towards the infamous 48 shades of blue extent of this path, but the reality is it frees you. You no longer need consensus and you can push the boundaries of what you try. Once you have gotten to blue as the most important element, you may, depending on your feeling for the N-Armed bandit problem, want to test out to 48 different variations, but that is not an affront to you. People built the system, fed the system, and control the system. Once you get to the point that you know what you need to know, why not let the system provide the answer for you? The system is only as valuable as the people who feed it, yet we fear the system and we fear becoming lost to the system.
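As a rough illustration of letting the system provide the answer, here is a minimal sketch of one common approach to the N-armed bandit problem, an epsilon-greedy allocation. It is not a method prescribed in this post. The function name `epsilon_greedy`, the 10% exploration rate, and the simulated conversion rates are all assumptions made up for the example.

```python
import random

def epsilon_greedy(true_rates, rounds=10_000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over variations with hidden conversion rates."""
    rng = random.Random(seed)
    pulls = [0] * len(true_rates)  # how many times each variation was shown
    wins = [0] * len(true_rates)   # conversions observed for each variation

    def observed_rate(i):
        # Observed conversion rate so far; 0.0 until a variation has been shown.
        return wins[i] / pulls[i] if pulls[i] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_rates))                  # explore at random
        else:
            arm = max(range(len(true_rates)), key=observed_rate)  # exploit current best
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:
            wins[arm] += 1

    return pulls, [observed_rate(i) for i in range(len(true_rates))]

# Hypothetical conversion rates for 48 variations; in a real test these are unknown.
rates = [0.02 + 0.0005 * i for i in range(48)]
pulls, estimates = epsilon_greedy(rates)
print(sum(pulls))  # total impressions allocated across all variations
```

Nobody has to champion a particular shade here. People decide which variations go in, and the allocation follows the observed outcomes.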

Moving down this path, away from individual ideas and away from trying to find the perfect solution, allows you to re-imagine and recreate who and what you are on the fly, without massive redesign efforts. It allows you to avoid holding anything on the site sacred and to stop worrying about the entire concept of “right”. The user experience becomes a fluid thing, where the true value of data, your creativity, and the ability to move past your own biases determine the magnitude of growth that you will experience. The true value of the individual is in how well they feed the system. The system is only as valuable as what goes into it, and by democratizing all ideas, by forcing the conversation away from ideas and towards feasible alternatives, you are giving more value to the creative freedom of the members of your organization.

Optimization powered Analytics –

Let me pose a theorem to you: Analytics, by itself, is completely worthless.

Let me challenge you by looking at the entire current practice of analytics as nothing more than hubris. The current use of analytics, especially by those who purport to be experts, is nothing but a newer accepted justification for what you were already going to do or were already thinking. Every new misunderstanding of moneyball, or of advanced statistical models, is a sales pitch designed to make you feel like you are making a much larger impact than you really are. This is not to say that analytics cannot be powerful, only that the way that data is abused by the practitioners of the industry to propagate myths and bad practices is worse than worthless; it’s inefficient.

People have gotten so lost in their ability to collect information, the speed with which we can get feedback, and the need to justify their existence that they never take a moment to question what you can really get from pure analytics. Numbers have become the new shield by which we persuade others of our “greatness”, not to actually provide value, but instead to tell stories driven by ego and a want to be the one making the decision. We so want to target a group that we find one that stands out, or we so want to show our value that we tell someone they are doing something wrong, only to replace their “bad” decision with an equally biased one, taking credit for any result that comes from this use of resources. In the rare best case scenario with analytics, you are left with probabilities and no clear direction; in the worst and most common cases, we are left with biased “insights” powered by everything but data.

In reality, we are no longer trapped by this use of data, like so many other industries before, because of our ability to interact directly and in an efficient and speedy manner. The data loses all value when we force a path on it or we forget what it can and is really telling us. Because we are a new field, mostly manned by people without real practical data discipline, we allow our own lack of understanding of the nature of data to allow our own biases to paint a picture that does not exist. There are hundreds of agencies, groups, and people who claim to have the newest way to repeat the same types of “analysis” without any newer insight into the value of that data. There are always new ways to corrupt statistics or different analysis techniques that are used to push an agenda, not to actually provide real value. We are so busy trying to run full speed down a path that we miss some really important and fundamental facts. Using only correlative data, we have no way to know the cost to change, the real value that anything by itself provides, or the actual scale of impact of any future change.

If efficiency is a measure of outcome over cost, then we have no way to have any insight into any piece of that equation. All the analysis in the world cannot overcome the limitation of a one-directional, limited data set from a constantly changing and imperfect ecosystem. We find something that sticks out in the data, and then pretend that this is the thing that is more valuable than all the other pieces of data, simply because we can “identify” it, despite the fact that we have no idea of the value of that change nor what some other undiscovered optimization would bring. How does knowing that people from search spend half as much time as other people who come to your site in any way tell you the cost to change their behavior? That you can even change that behavior? Or the relative scale of impact compared to other feasible alternatives? How is the anomaly any more efficient than the thing that looks like everything else? What do you really know from just identifying something from correlative data? Do not confuse your ability to derive value and efficiency with your ability to “discover” something in analytics.
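To make the efficiency equation concrete, here is a minimal sketch that ranks alternatives by outcome over cost. Every name and number in it is hypothetical. The point it illustrates is that both inputs have to be measured, which correlative data alone cannot supply.

```python
from dataclasses import dataclass

@dataclass
class Alternative:
    name: str
    measured_lift: float  # outcome measured in a controlled test (hypothetical units)
    cost: float           # resources needed to make the change (hypothetical units)

    @property
    def efficiency(self) -> float:
        # Efficiency as defined in the text: outcome over cost.
        return self.measured_lift / self.cost

# Made-up figures, for illustration only.
alternatives = [
    Alternative("anomaly found in analytics", measured_lift=1200.0, cost=40.0),
    Alternative("ordinary-looking module", measured_lift=900.0, cost=10.0),
    Alternative("full redesign", measured_lift=5000.0, cost=400.0),
]

# Rank every feasible alternative by efficiency, highest first.
ranked = sorted(alternatives, key=lambda a: a.efficiency, reverse=True)
for alt in ranked:
    print(f"{alt.name}: {alt.efficiency:.1f}")
```

With these made-up figures, the change that looks like everything else ranks first, ahead of both the anomaly and the largest raw lift.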

Why do we accept that limitation and why do we not try to give the context necessary to better answer those questions? Why do we perpetuate the myth of only passive data acquisition as a means to answer so many of the questions that we pretend to be able to answer today? Why do we pretend we can start with this magical data set and somehow arrive at the best answer? We are forced to use conjecture to make assumptions and then pat ourselves on the back when we get a result. We decide on what we are going to do analysis on, find a single answer, and then defend it because it is backed by data. Is that result a good result? If I have 100 possible positive outcomes, and I get the 2nd worst one, who would tell me that is a good thing? Yet when we do not account for that context of our answer, we are constantly shouting our accomplishments from the hilltop. Do we celebrate the outcome we got or consider the 98 that we missed? If you scored 2% on any test, you would think you failed miserably, and yet we hide this truth from ourselves to make sure that we all feel like we got an A. The truth is that we will never know any of the important contextual information we seek from correlative data alone.

Let me propose that testing, as a creator of causal data in a controlled setting, is the only way to actually achieve all those value propositions that you have been promised. That causal data, the seeking and creation of it and the use of it as a transformative agent to power that earlier data collection, to move past so many of the limitations of online data collection, is the only true way to answer these important questions. “Powering” your analytics, being willing to look past the myths and bad practices, and breaking down what you really know from your data is the point where myth becomes real and where you can truly and dramatically impact your business’s bottom line. This is why machine learning is such a big deal, why we move towards optimization algorithms, and why it is so vital that you understand the value propositions of your various types of data. All of those methods leverage causal information as a building block to grow and learn. There is a better way, but it requires you to be humble and disciplined to reach that “nirvana”.

The core problem with analytics is that you are limited to linear correlative data. No matter how pretty a model and how much statistics you apply, you will never know the value of an action, nor will you know the efficiency to change it. We are trapped because the passive nature of the data you are trying to use only looks one way (towards the past) and has no way of accounting for feasible alternatives, or even the null assumption. You are stuck in the land of rates of action; you have 2.8% CTR on your other products module on your product page, but is that good or bad? Even if that is much higher or lower, how do you know that acting on it is any better than acting on the thing that looks just like all the others? If you removed it, where do those clicks go and is that more valuable than what is there now? Does increasing it help or hurt, or more importantly, what happens if it is not there, or what is the cost of changing it as opposed to the cost to change another module? Are people who purchase more likely to sign up for newsletters, or is it the other way around? All of those questions can be answered directly and efficiently through testing, and once we have created a number of interactions, we can start to see patterns from those causal relationships. We have the power with very little effort to start to really see the impact of changes, not just try and extrapolate them blindly.
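One hedged way to picture the difference between a rate of action and the value of an action: run a controlled test with and without the module, then compare the outcome you care about. The sketch below uses a standard two-proportion z-test. The function name and the visitor counts are hypothetical, and this is one common way to read such a test, not a method prescribed in this post.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (difference b - a, z score, two-sided p-value) for two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null assumption
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return p_b - p_a, z, p_value

# Hypothetical test: A keeps the module on the page, B removes it.
# The metric is purchase conversion, not clicks on the module itself.
diff, z, p_value = two_proportion_z_test(conv_a=300, n_a=10_000, conv_b=345, n_b=10_000)
print(f"difference={diff:.4f}, z={z:.2f}, p={p_value:.3f}")
```

The click rate on the module never enters the calculation. What gets measured is what happens to the business outcome when the module is absent, which is the question the rate alone could not answer.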

What if you instead ignore all of that data in its passive form, and instead look for the active interaction of data to inform those decisions? What if instead of starting with correlative data, we ignore it until we have the context to make it valuable? What if we use the causal relations with an eye towards efficiency? What if you viewed data as an active measure, one that gains more value the more you eliminate unnecessary waste in the system, and one that only takes hold once you are disciplined in how you think about it, what you measure, and how you actively change it? What if we stop allowing our biases and misconceptions of data to dictate the start of our analysis, and instead allow the data to truly tell us what matters? What if you start measuring the value of your correlative data by its interaction with the causal data, to allow for a much deeper connection to efficiency? What if you start looking for the value of an action, not the rate of an action? Testing is your active arm, to change all of that correlative data into causal data, if you are willing to go down that path.

This is the opposite of the myth of using analytics to power testing. It means forcing yourself to accept that correlative data, with all the limitations that are inherent in online analytics, is not enough to make meaningful decisions. This is not about using testing as a means to prove one point right, but as a means to understand and value alternatives against each other. Changing correlative data into causal data presents you with information that is truly actionable and that truly gives you insight into the outcomes, value, and costs that we pretend we already have the answers for. This is the last step of the evolution of looking for the best answer and of stopping biases from leading you astray.

The challenge is that you cannot just take one test, or any single data point, and pretend you have meaningful inference. Just as you cannot pretend to know the direction of a correlation or the value of something from its rate of action, you cannot pretend to answer everything from a single test result. Diving through all that analytics data from a single test result is a dead end that leads to the same problem that plagues most uses of analytics. You have to be disciplined and can only reach this point after you have run a full series of tests. Think in terms of using this data to increase the efficiency of the system. You get real value only when you apply testing to power your analytics. We can measure the value of the items on the page, their very existence, and the costs to change them. We can quickly get tests live on multiple page types and measure the relative value. We can run a series of tests on a page, and induce changes that allow us to see what segments are exploitable, or even what influence various parts of a user experience have on those segments. If we are disciplined, we learn, and we never stop, then we can induce answers to achieve a positive result, while also answering those great unknowns that are ignored by analytics alone.

To make this even better, the act of acquiring the data also comes with the benefit of meaningful lift and improvement to your business. There is no zero sum game of only acquiring data or of getting lift; instead, using testing to power your analytics allows you to meet the needs of change and growth while giving you all the promised panacea that so many claim analytics is providing by itself. It allows you to truly think in terms of efficiency and to be able to know the value of the different feasible options before you. It requires you to change completely how you think about analytics, to look at it as part of a larger ecosystem by which you are informing the data, and then using that data to inform future action. It is not just pretending that the data is informed and then blindly using it to prescribe action. If you instead act to create causal information, use that to filter your correlative data, and do this with discipline, you can actually get those answers that we pretend we have today.

The sad truth is that most people who are in testing come from an analytics background. Just as many old school marketers struggle to stay current in the face of change, so too do many data “experts” who give new names to the same misguided techniques. They view everything through the analytics lens, and as such this makes them want to try and justify their analytics via testing, and to apply the same problematic disciplines to testing in order to bring it in line with current efforts. They so want to justify what they have done that they ignore its fundamental weakness and try to force new disciplines to conform to what they are doing. This leads to an entire marketplace full of people stuck trying to justify their existence, but very few willing to challenge its entire value proposition. I challenge you to avoid that black hole, be willing to challenge your own worldview and your own core beliefs about data, and to instead look at how you can best get and acquire meaningful data and how best to leverage it outside of what you are comfortable with. Very few people try and look at testing as its own discipline, or even better to see how that discipline can impact and change how you view other actions. There is a giant fishbowl of people who are in a race to the bottom justifying and preaching analytics as a feeding system for testing. I challenge you to be better than the current environment.

Let me instead suggest that you will only achieve real value if you flip that system, challenge yourself to think outside of that box, and power your analytics via your testing. Testing is just one skill of many, but it deserves its own place at the table, not one that is a filter by which you justify other actions.

Conclusion –

The goal of these posts is to introduce new ways of thinking and to challenge your current mindset. I have shown the evolution from the most fundamental skill to paradigms that challenge your entire data worldview. It is only by changing what we do that we grow, and it is only by challenging our own core assumptions about what works that we are able to really make the dramatic impact to the bottom line that we all claim to want to achieve. You cannot just accept that everything you hold true today will be the same in the future, nor can you expect to improve if you refuse to change your own behaviors.

The reality is that there is no such thing as “right”; in the entirety of human history we continue to find better answers to all our questions. What I am proposing is allowing these new ways of thinking to interact with what you are doing and to see if you can then find a newer “righter” answer that brings your program to a whole new level. It is only through changing our fundamental building blocks of what we do that we achieve the scale and impact that we want to achieve. Change who you are, what you think, and let in other ways of thinking and try to be better than the water you are swimming in. Be willing to leave your current lake and find the diverse ocean of disciplines and ideas that are out there, and you will always be growing and getting better at what you do.

To navigate the entire testing series:
Testing 101 / Testing 202 / Testing 303 – Part 1 / Testing 303 – Part 2
