One of the great ironies of our industry is all the time wasted talking about “Big Data” or “Governance” or a thousand other wasted catch phrases. All of these are simple attempts to show a “maturation” or growth of the online data world. There is enough backlash to point out that none of this is really new, that it is simply the online world retracing the offline Business Intelligence path, a world that has never produced a fiftieth of the value it pretends to. The irony lies not in what is being debated or not debated, or in whether online and offline are the same, but in the fact that we all talk around the real issue and do nothing to address it.
The reality is that most BI work is a wasted effort, designed to facilitate a predetermined wish of some executive, and the people rewarded are never the ones who actually provide the best analysis, but the ones who produce the analysis that best helps move other people’s agendas forward. The online world is simply following suit, to the point that we no longer look at how many reports you can create as justification for existence, but at how fancy a graphic you can create to support one group’s agenda or another’s. This evolution follows a normal path from creation and storage, to access, and now to delivery of data, without once dealing with the real issues at hand: human beings are awful at understanding and leveraging data, and most people (especially those in marketing) are awful at their jobs.
If you think about it, it’s not that shocking that marketers are horrendous at their jobs; they make a living telling stories and trying to convince others of things that have no basis in reality. This means that to exist in this world, you are left with two options: act like a sociopath, or unconsciously acquiesce to some sick permanent version of the prisoner’s dilemma, in which, as long as no one points out how full of it the person speaking is, they will return the favor. Both diseases leave the same outcome: a group of people who exist to propagate the work of that same group of people, and who seek outside justification, be it awards (from the same group), data searched only in the one way that supports them, or case studies of people who did the same thing and likewise lied their way to “success”.
The threat of data is that, when used in a rational manner, it can shed light on the real value of the day-to-day actions we hang our hat on. We can find out just how horrendously off our preconceptions are. One of the great ways to succeed at testing is simply to bet against people; people are so rarely right that just picking the other direction creates a winning streak that would let you live as a billionaire if you could translate it to Vegas. You will quickly find that most experts are nothing more than storytellers, and that the largest gains companies make are often the least publicized, while those that are shared are often subconscious attempts to get others to fall prey to the same mistake the sharer wasted months on. With almost no effort you can prove that most actions taken provide no value whatsoever to the company, or are so inefficient that they are far worse than nothing. The data and evidence are easy to get, but we avoid them in order to cope within this world.
Why can’t people act rationally more often? Why is data accepted and abused, and why do we seek confirmation, not information? Why do we not worship those that get results instead of those that tell stories? The answer is simple: we fear most that which may hurt our own world view. If everyone were willing to search for the right answer, we would all be better off, but as soon as one weak person accepts the word of one sociopath, we are all set down this path, or left to suffer silently fighting against the tide.
This is not a new problem; Kant, Engels and many others have been talking about it for hundreds of years; we just find new names for the same human weakness. We seek out people to do attribution, and then believe it tells us anything about generation. We seek out those that confirm our hypotheses, not those that disprove them, despite the fact that a disproven hypothesis inherently has the better outcome. We want people to speak at conferences and point out why everyone else is screwed up, while telling us we can change by doing the exact same things they were railing against, now under a new name. We want to find a site that tells us “which test won”, not one that helps us run better tests or achieve any value whatsoever. We are constantly searching for the next affirmation to justify who we are, not improve who we are.
“Reality” is not a kind mistress for those that are even slightly interested in it. Empirical realism is looked at and talked about, but practiced by so few that it is almost as meaningless a buzzword as “personalization”. While it helps companies, it rarely helps those who wish to exist in a corporate environment (see prisoner’s dilemma). We are forced to make a Sophie’s choice: do what makes others happy, or do what helps our company. We all try to find ways to convince ourselves and others that we are not faced with this choice, yet we only succeed when we stop caring about or thinking about anything other than our own gain. To facilitate this, we find every way possible to make the mental pain go away and to find others who will tell us it will all be ok.
So if this is the sandbox in which we play, is it any wonder that our “heroes” are those that best project ways to make others believe what is being done matters? We worship at the altar of Avinash, or Peterson, or the Eisenbergs, or anyone else we can find as justification for what we were already doing. We have no way of knowing if what they say is correct, and personal experience shows that following most of that advice leads to immensely fallible results. Far be it from an inquisitive mind to question if the current action is the right one, or if there is a better way to think about and tackle problems. We instead allow others to dictate to us, so that we can avoid cognitive dissonance and rest easy at night… ok, on second thought, most marketers are both suffering from living in a prisoner’s dilemma and are also sociopaths. The data shows that these are not mutually exclusive but complementary. Glad we got that squared away…
If you want to really make a difference, if you are tired of this same old world or of the charlatans who propagate it, are you prepared to fight the tide? Are you able to evaluate your own work, to go past the comfort and find out how wrong you are in just about everything you do? Are you then able to get past that mental scarring to do the same for others? Will you back down the first time someone pushes back, or will you make it your quest to do the right thing when it is neither profitable nor easy to do so?
The history of business shows that rarely, if ever, does this problem truly go away, or does the better answer win. While history is written by the winners to justify their existence, randomness and the trampling of others are bred into every page of this twisted form of storytelling. And yet, until we deal with this real problem, until we are more interested in doing the right thing than the easy one, what will really change?
We will continue to waste time and effort on data in order to justify wasted time and effort in most other efforts. We will continue to seek new words for old problems, and we will continue to make heroes of those that most hold us back. Until we stop propagating the lie and look at ourselves first, how can we ever really deal with the real problem of data? Not the collection, not the sharing, not the presentation, but the people who are wired to use that data in the least efficient and most self-serving way possible. You want to solve big data, you want to change the industry? Stop wasting time on tag management or Hadoop; solve the people, since that is where all problems lie. Don’t solve how you share your point, but how they think about data, and whether they are rationally using it to find an answer, or only to justify one.
Since everyone puts out their yearly recaps at this time of year, I thought it would be interesting to look back at some of the larger bits of news and changes in my industry over the past 12 months:
1) Tag Management blows up… And then starts dying
At the end of last year and the start of this year, there was massive news about all sorts of new players pushing heavily into the tag management space. One of my personal favorites was Ensighten, but in general they were all focused on trying to make it easier for companies to get their analytics code out across their sites. Considering how little value most companies actually get from their tools, this was probably a good thing, as it would at the very least stop these companies from wasting quite as many resources.
Unfortunately for all the bit players in this space, the two largest analytics providers, Google and Adobiture, decided to release free tag management solutions. Making any tool a commodity in a saturated marketplace (especially one with questionable ROI) tends to be the death of any niche, and this move kicked dirt on the grave of the entire market niche way too early. It will be interesting to see where the next big push is, as more and more companies become ripe for vultures to pick apart their perceived problems and provide “solutions”.
2) Growth of the competition
Not that these companies started operations in 2012, but they certainly started making waves and carving out their own unique niches. The biggest players to join the mainstream were Optimizely, Visual Website Optimizer, and Monetate. With GWO finally dying (god, was that an awful tool), these players have grown from carrion feeders into real contenders in the mainstream. All of them offer some pretty cool features, from slick interfaces, to easy deployment, to full-service rapid testing. Test&Target, a tool that I know a thing or two about, has certainly suffered from a massive failure to really innovate, and its newest direction does absolutely nothing to resolve that issue.
Test&Target still completely blows away the competition when it comes to things that actually provide value, like flexibility, a visitor-based metric system, segmentation, and data usage. For the people out there who are going to run a meaningful testing program, there still isn’t real competition (though I wish there were); but for people who have no clue what they are doing, people who listen to Tim Ashe or the Eisenbergs and whose immediate first reaction is not “these people have absolutely no clue what they are talking about”, these tools do make it cheaper and easier to waste resources.
That being said, the play that the new players are making sure seems like they are tackling the lowest common denominator. Between completely ignoring statistical significance, pushing people to just follow through with their own basic biases, and touting how easy it is to get tests up instead of enabling meaningful tests, these groups are doing far more harm to the marketplace than good. Unfortunately, the direction of all of these tools seems to follow the innovator’s dilemma, and instead of improving the market, all of them seem hell-bent on racing to the bottom. The only real hope is that they survive long enough for one or two of them to become a truly functional tool, and for that to push Adobe to make its own tool meaningful in the optimization space.
3) Big Data continues to be a buzzword without definition
The second half of the year brought a bunch of push-back against the use of the term big data, most of it petty arguing about what the word really means. The start-up marketplace has been oversaturated with technology and tools to present or combine data, pushing past Hadoop to many newer, similar technologies. The irony, of course, is that we are still operating from an A) collect data, B) ?????, C) profit business plan. Big Data seems to be just a word thrown around (like marketing) to make it sound like people have a clue what they are doing.
Having worked with so many different organizations, the one thing that stands out more than anything else is that there is very little knowledge about how to get value from data, but a thousand different ways to find data after the fact to validate someone’s agenda. The more data you collect and the more complicated the systems, the more this seems to be true. And to top it off, you have people who feast on this gap by providing flashy middleware and topware, which produce fancy dashboards that provide zero value but make some executive feel powerful.
I do not expect this pattern to change anytime soon.
Ok, so a few predictions for 2013:
1) Buzzword bingo will never go away –
I think we finally reached a critical mass where most people laugh at social (I hope), but that doesn’t mean that buzzwords will go away. Personalization will hopefully start getting more push-back by the middle of the year, and will be replaced by newer buzzwords. It seems like native advertising is the current “gem”, but I expect that once people figure out it is just a new name for the same tired BS, they will move on to grander, more interesting words. My guess, based on the actions of Adobe and IBM, is that suite and digital marketing collaboration will come back in a big way; but no matter what the word is, the instant it becomes a big deal you will find all sorts of people popping out of the woodwork talking about how they have always been experts in this subject and how they will be happy to provide the one thing you have to do to be successful.
2) By the end of the year, at least 1 of the companies in the testing space will die/merge
Like all industries, you see an explosion of wannabe technologies emerge to take on a clear leader; eventually most of that technology dies and becomes irrelevant, while a few grow enough to become legitimate contenders for the title. This space, with all of its flash and zero substance, is ripe for exactly this scenario; the only question is whether it happens by the end of 2013 or the middle of 2014.
If I had to guess which tool is most likely to join the ranks of GWO, Vertster and Optimost, I am going to go with Monetate. Besides all the massive limitations of that tool (and the god-awful statistics), it seems to be caught in the middle between a much better but higher-priced competitor, and a lower-priced end of the market (Optimizely, VWO) that is just as good and easier to use.
If I had to guess which tool is most likely to mature meaningfully, I am going to go with Optimizely, just because of the flash. If they ever get someone who actually understands testing and is not just BSing their way through with moronic tales about the first Obama campaign, then they can mature enough to really become a major player. There are features of that tool that are ahead of the market (though the actual value of those pieces is questionable at best). That being said, they will most likely continue their carrion approach of “ease of use” and fast testing instead of meaningful or relevant testing.
I sincerely hope they do mature, however, as the industry seriously needs meaningful, thought-leading competition instead of what we currently have.
So there you have it. Happy 2013 to all.
As I once again see one of the surest signs that you don’t know what you are doing in testing, I am forced to clarify one of the greatest misconceptions when it comes to testing.
The first rule of any statistical analysis is that the data must be representative. It doesn’t matter how statistically accurate your count of blue cars is if you are trying to measure the impact on the entire freeway…
So why, then, do people think tests should only include “new” users? This is one of the most consistent confusions of cause and effect. People who have been to your site came there for a reason and are coming back for a reason. Just because they saw the old site does not mean they are not fundamentally important to your new analysis. They are not repeat users because of the new site (causation); they are people who have declared an intent and are researching or repeatedly interacting with your brand and products. They represent everyone who has previously wanted or been interested in what you are selling and who happened to have been on your site before (correlation).
This means that while there is some possible interaction from seeing both the old and new experiences, ignoring these visitors says that you do not care about, and are not interested in, the revenue generated by anyone who has ever come to your site, purchased from you, or thought about purchasing from you before. There will be some interaction from the change in experience (if they even remember it), but it is spread evenly over each sample, and it is especially mitigated in a visitor-based analysis (which measures performance over time). What is not accounted for when you exclude them from your test, however, is 100% of all the people who have given you revenue before!
That means that if you have any business where you would like people to repeatedly use or purchase from you, not including them invalidates all your data, since your data set is both biased (toward people who have never been interested in you before) and not representative of your long-term population.
To put it another way, to match this time of year: this is exactly the same as polling only Fox News watchers to represent all of America in the presidential election, or only MSNBC watchers on every social bill. It is a population, you will get a result, and if you really want to ignore reality you can go with the data, but it in no way represents the entire population or tells you what matters to the entire population.
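The bias is easy to demonstrate with a quick simulation. Everything below is hypothetical (the conversion rates, the 40% returning-visitor mix, and the `run_test` helper are all invented for illustration), but it shows how excluding returning visitors can distort, or even flip the sign of, a measured lift when returning visitors both convert more and respond differently to a change:

```python
import random

random.seed(42)

# Hypothetical conversion rates: returning visitors convert at a much
# higher base rate than new visitors, and respond differently to the change.
RATES = {
    ("new", "A"): 0.020,
    ("new", "B"): 0.030,        # new visitors like the new design
    ("returning", "A"): 0.080,  # returning visitors convert far more often...
    ("returning", "B"): 0.060,  # ...but the new design hurts them slightly
}

def run_test(include_returning, n=200_000, pct_returning=0.4):
    """Simulate an A/B test, optionally excluding returning visitors,
    and return the measured lift (B conversion rate minus A's)."""
    conv = {"A": 0, "B": 0}
    count = {"A": 0, "B": 0}
    for _ in range(n):
        kind = "returning" if random.random() < pct_returning else "new"
        if kind == "returning" and not include_returning:
            continue  # the common (mistaken) practice: test "new" users only
        arm = random.choice(["A", "B"])
        count[arm] += 1
        if random.random() < RATES[(kind, arm)]:
            conv[arm] += 1
    return conv["B"] / count["B"] - conv["A"] / count["A"]

print("Lift, new visitors only: %+.4f" % run_test(False))
print("Lift, full population:   %+.4f" % run_test(True))
```

With these made-up rates, the new-visitors-only test reports a healthy positive lift, while the full-population measurement comes out far lower and can even go negative, because the excluded returning segment, the one that generates most of the revenue, responds badly to the change.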
There is nothing less important than the winning recipe of a test.
I want to let that sink in.
Everyone loves to get caught up on which recipe won, because it is what you look at and what others want to know, but as a tester, it is the way that you arrive at that answer that determines whether you actually provide value or just an answer. Individual outcomes interest people who have something invested in being “right”; consistent, meaningful discipline is what matters for people who are invested in improving things consistently. If you discovered something that is merely the 2nd best of 10 feasible alternatives, you wouldn’t knowingly pick the 2nd best, but when you only compare two things, that is most likely what you are doing: you haven’t accomplished anything, and you are actually losing money. If you didn’t measure the outcomes of multiple alternatives, or didn’t measure against a global site-wide metric, or didn’t account for the cost of arriving at that conclusion, then you are fooling yourself into thinking you have accomplished something, when all you did was take resources from others to make yourself look good. It may impress others, but it has not provided one bit of value to the organization.
In order to find the best alternative, you need the context of the site, the resources, the upkeep, and a measure of the alternatives’ effectiveness against each other. Even if something is better, without insight into what other alternatives would do you are simply replicating the worst biases that plague the human mind. Figuring out the better of two options is an answer; finding out the value of different feasible alternatives is providing value. Finding out who was right by “picking the winner” is great for people’s egos, but making sure you measure multiple alternatives and choose the options that provide the highest return to the largest population for the lowest cost is what makes you successful.
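A back-of-the-envelope simulation makes the 2nd-best point concrete. Everything here is an illustrative assumption (the 10 feasible designs, the standard-normal distribution of their true lifts, and the `best_of` helper are all invented), but it shows how much value is left on the table, on average, when you only ever compare a random pair out of many feasible alternatives:

```python
import random

random.seed(7)

def best_of(values, k):
    """Test k randomly chosen alternatives, keep the winner, and return
    how much value was left on the table versus the best of them all."""
    winner = max(random.sample(values, k))
    return max(values) - winner

# Hypothetical: 10 feasible designs whose true lifts we model as draws
# from a standard normal distribution (unknown to the tester).
trials = 10_000
for k in (2, 4, 10):
    avg = sum(best_of([random.gauss(0, 1) for _ in range(10)], k)
              for _ in range(trials)) / trials
    print(f"testing {k:2d} of 10 alternatives leaves ~{avg:.2f} on the table")
```

The exact numbers depend on the assumed distribution, but the pattern does not: the fewer of the feasible alternatives you actually measure, the larger the average gap between what you shipped and what you could have shipped.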
To make it worse, people then look at the results and think that they will get the same result on their site, and in the worst case, they do. Sites like whichtestwon, which focus on letting people find out which of two options won, sound great and capture people’s attention. They let you guess and pat yourself on the back when you are right, but the reality is that they are designed to feel good, not to actually provide value. If you wanted a site like that to provide value, then it would require far more context than a winning screenshot.
The problems of a tester are twofold: first, convincing others to test, and second, improving the testing to make sure that you are maximizing return and lowering cost. A good tester needs to balance both, since there is little to gain outside of personal reward in just foolishly running tests. But sites like whichtestwon? They are designed to assist the first: to provide evidence that you can get a positive outcome (missing that you would also get outcomes from other uses of the same resources) without giving any real insight into whether you actually provided a positive outcome (an outcome, by itself, tells you nothing). They are designed exclusively for people to abuse in pushing their own agendas. To take a quote directly from their tour:
“Site shows stats from various A/B tests – Finally I’ve got evidence to show clients on a load of design decisions!”
That shows everything that is wrong. Testing should be about seeing what the value of different test variants is, not making the case for a specific one that you want. In order to be successful, you have to prove yourself wrong. If it would have worked the first time, then there was no point in running the test (you are wasting resources) and you have learned nothing. You should not be given “credit” when you are adding cost and providing nothing more than validation for others. When you are wrong, when you have tested what you wanted alongside other alternatives and found those alternatives prove more efficient, even if what you wanted was better than control, that is the moment you are truly gaining something from your testing efforts.
There is a plague of people in our industry who try everything they can to show how much value they got from a single test. Who view testing as a way to get what they want onto the site over the HiPPO or someone else. Who abuse testing to push their agenda and then take credit when they find something that proves better than what was there before. The act of running a test is not a measure of success, nor is having an outcome. Added value only comes from finding an outcome different from what you would have already done. To do that, you must measure multiple feasible alternatives and be able to find an outcome different from what people want. If you aren’t able to, then the most fundamental problem you have is you, and how you think about testing. If you are able to, then the individual outcome, what won, is far less important than how you got there and what you chose not to do. The measure of a testing program is how often it proves people wrong, and how consistently it can do that with the least amount of resources possible.
Being a good tester means that you always know the relative costs. It means that you know how often something works, not just that it worked one time. To be good, you should be able to create meaningful, actionable lift across all your tests, not jump for joy and promote yourself to the world when you managed to find one thing better in 1 out of 5 tests. Don’t settle for taking the easy road and trying to take credit. Add value, be better, learn how to look at things, and you will actually create value, today and always. Once you go down that road, no one cares which variant won; it has no bearing on long-term success. Great, you found the thing to push from this campaign; that is just one small step on a long road of continuous action. You wouldn’t reward someone just because they managed to write their name on a test, so please do not think that whichtestwon somehow does anything to inform you how to be a better tester.
If you really wanted to see a site like whichtestwon matter, then show the variants that didn’t win. Show multiple options for each outcome and show what the best option was. Give us a measure of the cost and the internal roadblocks that had to be overcome. Let us know if that outcome was greater or worse than others for that group, and what they are doing with the results to get a better, more efficient result next time. If you are interested in anything more than self-promotion, post the things that don’t work. Tell us how often something wins, not the one time it did win. Use the site to find examples of where you were wrong, and remind yourself that you are not right… ever. The most we can ever hope to be is a little less wrong, while working on ways to speed up the process of discovering just how wrong we are.