There were at least four Huddles on testing at X Change, and the one I went to was a very well attended (full) Huddle on Segmented Testing in which a majority of the participants were engaged in fairly serious online testing. This combination of interest and real participation is clear evidence of the increasing maturity of this discipline.
But as I listened to the conversation, it crystallized a frustration I’ve been feeling for some time – frustration that there seems to be a fundamental disconnect in the relationship between Web analytics and testing.
It’s a disconnect that is surprisingly common on the ground at our enterprise clients – where the analytics groups are often no more than peripherally involved in testing efforts. It’s a disconnect that is shockingly common at the vendor level – even where the same vendor (as is often the case these days) is selling both products. It’s a disconnect that is encouraged by an approach to testing recommended by many vendors – an approach that I believe is fundamentally misguided. And it’s a disconnect that undermines the basic purpose of Web analytics.
After I listened to the conversation at X Change, I went back and read some of the literature that vendors produce around testing. The basic methodology that seems to be most commonly recommended is something like this:
1. Creative Development: Build an alternative creative (or MV set of creatives) for a site page/experience
2. Run a General Test against all visitors
3. Identify the winners
4. From the best alternatives, analyze and test by segment to identify segmented winners
5. Target these creatives by segment on a going-forward basis
6. Repeat from Step #1
This process can be summed up as Create – General Test – Segment – Target.
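To make the mechanics concrete, here is a minimal Python sketch of what steps 2 through 4 amount to in practice; the variant names, segment fields, and records are all invented for illustration. You run one generic test against everyone, pick overall winners, and then slice the results by whatever segment dimensions happen to have been captured.

```python
from collections import defaultdict

# Hypothetical results from a run-of-site test: one record per visitor.
# "variant" is whichever generic creative the visitor happened to see; the
# segment fields are simply whatever dimensions were captured along the way.
results = [
    {"variant": "control",    "source": "search", "new_visitor": True,  "converted": False},
    {"variant": "headline_b", "source": "email",  "new_visitor": False, "converted": True},
    # ... thousands more rows in a real test
]

def conversion_by(rows, *segment_keys):
    """Conversion rate per variant, optionally sliced by after-the-fact segments."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [conversions, visitors]
    for r in rows:
        cell = tuple(r[k] for k in ("variant", *segment_keys))
        counts[cell][0] += int(r["converted"])
        counts[cell][1] += 1
    return {cell: conv / n for cell, (conv, n) in counts.items()}

overall = conversion_by(results)              # step 3: pick the overall winners
by_source = conversion_by(results, "source")  # step 4: hunt for segments afterwards
```

Notice that in this flow the segments only enter the picture at the very end, as a reporting dimension applied to creatives that were never designed with them in mind.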
It sounds great, but I don’t think it makes any sense. In fact, I think it encourages people to test in exactly the wrong way.
The fundamental problem is that in this model, segmentation comes after creative development, and creative development itself is left hanging, unexplained, in Step 1.
The most important questions I think people should ask themselves when they begin testing are: what are you going to test, and why are you testing it?
I see nothing in this methodology that can even begin to answer these two questions.
What’s more, the only likely answers to these questions that will make sense in 99% of all cases are tests targeted to specific segments, not the run-of-site visitor tests that vendors recommend.
How can anyone build a creative alternative without a segment and a visit intent in mind? And if you build a creative with a segment in mind, why would you ever do a run-of-site visitor test on it?
It’s just this simple: there is no way to build plausible creative alternatives except with respect to particular segments and use-cases.
Think about it. This process suggests that you build multiple creatives without focusing on your audience. How can this be done? By throwing random words at a wall?
Then you test these general creatives in random combinations against every site visitor. Then you try to unearth the segments that match the winning combinations.
It has got to be the most backward, impossible view of how to test imaginable.
Let me recast this process in a way that I think is vastly more intuitive and more effective:
1. Create a visitor segmentation
2. Create a set of key site use-cases
3. Create a two-tiered Visitor Segment/Visit Type Use-Case Analysis
4. Target Segment/Use-Case areas where analysis suggests there are significant problems and opportunities
5. Design your creative to target the particular Segment/Use-Case you want to improve
6. Test on that segment
7. Repeat from Step #4
In this method, almost every test you run will START as a segmented test. Why? Because starting with a segmented use case is the only practical way to understand who and what you are testing and develop a plausible creative alternative.
Indeed, steps #1-#3 are exactly the steps we take when we create a Behavioral Use Case Analysis. It’s one of the core analytic products we do (and part of what got me thinking along these lines is that I had just given a Think Tank class on Use-Case Analysis prior to X Change) and, the more I think about it, the more I think it’s the natural vehicle for developing a comprehensive, coherent, directed testing plan. A plan that has a clear reason for every test and every segmentation that you try.
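Steps #1 through #4 can be thought of as building and then mining a simple grid. The Python sketch below uses entirely invented segments, visit intents, and rates; it is only meant to show how a segment-first test plan falls naturally out of the two-tiered analysis.

```python
# A toy two-tiered Segment/Visit-Type grid (steps 1-3). The segments, visit
# intents, and rates are made up purely to illustrate the structure.
use_case_grid = {
    # (visitor segment, visit intent): observed vs. target success rate
    ("prospect", "product_research"): {"observed": 0.04, "target": 0.09},
    ("prospect", "price_check"):      {"observed": 0.02, "target": 0.03},
    ("customer", "support_lookup"):   {"observed": 0.55, "target": 0.60},
    ("customer", "repeat_purchase"):  {"observed": 0.18, "target": 0.30},
}

def prioritize_tests(grid, top_n=2):
    """Step 4: rank Segment/Use-Case cells by the size of the opportunity gap."""
    gaps = {cell: v["target"] - v["observed"] for cell, v in grid.items()}
    return sorted(gaps, key=gaps.get, reverse=True)[:top_n]

# Steps 5-6: each prioritized cell gets its own creative and its own test,
# run only against the visitors who fall into that cell.
for segment, visit_type in prioritize_tests(use_case_grid):
    print(f"Design and test a creative for '{segment}' visitors on '{visit_type}' visits")
```

The point of the structure is that every test that comes out of it already knows its audience, its use-case, and the business reason it exists.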
I won’t claim that there are never cases where you might plausibly develop a creative for ALL visitors and then look for segments that perform better or worse. Every once in a great while, that strategy might be appropriate. But I will claim that it’s much rarer than the reverse. I also believe that the segments people think about when they try to reverse engineer segments from a test are almost never the right ones.
When we build the two-tiered segmentation (Visitor/Visit Type) for Use-Case Analysis, the Visitor Level segments are probably the ones you would expect. They are business-based segments designed to capture the type and value of the user. But the visit-type segments are more complex. They are designed to capture the visit intent for the visitor based on their actual behaviors.
To do this, we use behavioral signatures that take advantage of how people arrived, what prior content they’ve looked at, what visitor segment they are in, what they click on first, what they search on, etc. to decide what type of visit they are engaged in. We almost always use a combination of multiple variables designed to identify a particular kind of visit.
Using these behavioral signatures in conjunction with visitor types yields a set of testable segments that actually mean something. These Visitor Type/Visit Intent segments are much different from the sort of random variables (DMA / Source Channel / New vs. Repeat) that seem to dominate when people begin with a run-of-site creative test. They are also almost impossible to reconstruct unless you approach them from the perspective of identifying specific use-cases.
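As a concrete, if simplified, illustration of what such a behavioral signature might look like in code, here is a small Python sketch. Every signal, rule, field name, and label in it is an assumption made up for the example; a real signature would be derived from the actual analysis of the site.

```python
def classify_visit_intent(visit):
    """Illustrative behavioral signature: combine several signals from a visit
    (how the visitor arrived, what they touched first, what they searched on,
    what they have looked at before) into a visit-intent label. The rules and
    labels here are invented for the sake of the example."""
    referrer = visit.get("referrer_type", "")        # e.g. "search", "email", "direct"
    first_page = visit.get("first_page", "")
    search_term = visit.get("onsite_search", "")
    prior_content = visit.get("prior_content", [])   # pages seen in earlier visits

    if referrer == "search" and first_page.startswith("/support/"):
        return "support_lookup"
    if search_term and any(p.startswith("/products/") for p in prior_content):
        return "late_stage_research"
    if referrer == "email" and first_page.startswith("/offers/"):
        return "offer_response"
    return "general_browse"

# Pairing the inferred intent with the visitor-level segment yields the kind
# of testable cell described above.
visit = {"referrer_type": "search", "first_page": "/support/returns",
         "visitor_segment": "customer"}
cell = (visit["visitor_segment"], classify_visit_intent(visit))  # ("customer", "support_lookup")
```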
As I look at and listen to the types of tests that I see organizations doing, it seems to me that they suffer from a number of sins, nearly all of which are related to buying into the process of run-of-site testing followed by segmentation.
Most programs I see lack a comprehensive test plan and a clear direction of where they are heading; they tend to test minor creative alternatives, not deep business strategies; they over-rely on multivariate testing; they generally produce little differentiation of segments in actual targeting (since the creative is inherently generic); and, finally, they show little consistency in significant segments by test over time (since segments are derived in an essentially random fashion).
I believe this myriad of sins is driven by the basic disconnect between Web analytics and Testing – a disconnect driven by a methodology for testing that gives no real weight to prior analysis and tries to back-fit segmentation onto run-of-site visitor tests.
There has got to be a better way of doing testing. I should re-state that: there is a better way to do testing. If your Web analytics program isn’t creating a rich framework of Segment / Visit Types that could drive a segmented testing plan, you need to re-think your Web analytics. And if your testing program isn’t driven by a pre-existing, analytically driven model of visitors and visit types, then you need to re-think your testing program.
I’ve thrown out quite a few potentially controversial claims in this post. So I’m going to save my discussion of multivariate testing for next time – when I’ll explain why, if you accept the critique given here, you may decide that multivariate testing isn’t just more complex and more work than A/B testing, it’s also quite likely a bad idea.
Gary, agree completely.
"Brute force" testing, where one keeps throwing stuff at the wall to see what sticks, seems like an incredible waste of precious resources. And too often no actual learning takes place, people don't stop to ask "why" certain things work better than others. I hope your Use Case approach teaches people the value of taking the time to understand "why".
Test outcomes are not as random as people think; there are patterns that repeat over and over, and segmentation is the key first step to being able to recognize these patterns.
"Optimizing for the average" is truly suboptimal!
Posted by: Jim Novo | October 04, 2010 at 05:27 AM
Great post. We have been continuously testing for the past 3 years and were one of the first European clients of first Optimost and later Offermatica (now Adobe Test & Target), users of Google Optimizer, and finally we decided to build our own testing tool integrated with our CMS.
Whilst often guilty of the stepped methodology criticised here, we wholeheartedly disagree with the vendors, and agree with you, on the value of MVT over A/B testing. A/B tests are easy and flexible and can be launched in no time. Analysing the results is straightforward. Often when running an MVT test one is tempted to test "because we can" and consequently add ever more superficial and marginally different creative elements into the mix.
The fundamental problem with most testing-as-a-service offerings, as you rightly point out, is the lack of capability to test deep business strategies that go beyond simple changes in creatives.
Posted by: Peter - Serenata Flowers | October 05, 2010 at 07:56 AM
Excellent blog, cannot agree more; from experience, very few companies are linking MV testing with analytics.
Taking this a step further, it's very rare for usability testing to also be supported by analytics.
In both cases this is normally a question of ownership; where we have all these things controlled by a user experience director (rare, but starting to happen) we see the best results.
Posted by: Garry Lee | November 01, 2010 at 12:32 AM