Right before the end of last year, I wandered sideways into the huge and complex issue of web analytics process. I got there inadvertently, first posting some thoughts on why I’m reluctant to look at my blog readership numbers and then, in response to some questions I got, tackling the issue in a more logical fashion. In essence, I proposed that one of the most important factors in designing good web analytics processes is having a theory of error – why things in web analytics go wrong. And I laid out six of the most common causes of error in web analytics:
1. Self-Interested Measurement
2. Lack of Statistical Significance
3. Unreliable Data (and what to do about it)
4. Siloed Optimization
5. Metric Monomania
6. Tactical Focus
I promised, this year, to lay out in more detail how appropriate processes can be designed to protect against these problems. And I fully intend to do that. Just not today.
As I thought about this list, I felt that I had left out at least two significant problems that really should be included.
The two additional causes of error I had in mind are misinterpreting self-selection and ignoring the influence of navigational structure. Both are problems that a good web measurement process really should take note of. Both are related to ‘Lack of Statistical Significance’ in the sense that they are technical problems in measurement, but neither is specifically about statistical significance or about controlling for randomness and variation. I decided to call them out separately because, after thinking about them for a bit, I believe they bear on different aspects of good process.
Self-selection occurs all the time on the web. In fact, it isn’t a problem at all in the context of web design. It’s actually a desirable outcome. We nearly always want people to self-select into appropriate channels. But for measurement, the fact of self-selection makes understanding differences in page or tool performance incredibly challenging.
I talked about the issue of self-selection as it relates to online survey opinion research and satisfaction scores in a recent post. Here’s a short excerpt:
“There is no way to determine from the basic facts:
Comment users have a higher sat score than non-comment users (attitudinal)
Comment users consume more pages than non-comment users (behavioral)
if either relationship is causal. We don’t know if commenting self-selects visitors who happen to be more satisfied and consume more content or whether it actually contributes to that relationship.”
In other words, people who use comment functionality may already be more engaged and have higher satisfaction than those who do not bother. If so, the apparent (and statistically valid) relationship between using comment functionality and satisfaction is non-causal – at least in the direction we are hoping for. Comments are not driving satisfaction, they are being driven by it.
This is an incredibly common source of error. I think it’s fair to say that a simple majority of the VOC (Voice of Customer) analyses I see amount to nothing more than interpretive errors caused by self-selection. And, of course, the problem of self-selection isn’t limited to VOC data. It is rampant in behavioral analysis as well.
Self-selection errors may be rampant in web analytics, but they come from a class of errors that are extremely well understood in broader analytic circles because they happen to be common everywhere. This, at least, tends to make them easy to explain to both analysts and product marketers.
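To make the mechanism concrete, here is a tiny simulation sketch (in Python, with invented numbers; nothing here comes from real VOC or behavioral data) in which a hidden “engagement” trait drives both the decision to use comments and the satisfaction score, while commenting itself contributes nothing:

```python
import math
import random

random.seed(42)

def simulate_visitor():
    # A latent "engagement" trait drives BOTH the decision to comment and
    # the satisfaction score; commenting itself adds nothing to satisfaction.
    engagement = random.gauss(0.0, 1.0)
    uses_comments = random.random() < 1.0 / (1.0 + math.exp(-(engagement - 1.0)))
    satisfaction = 6.0 + 1.5 * engagement + random.gauss(0.0, 1.0)
    return uses_comments, satisfaction

visitors = [simulate_visitor() for _ in range(100_000)]
commenters = [s for c, s in visitors if c]
lurkers = [s for c, s in visitors if not c]

print("avg satisfaction, comment users:     %.2f" % (sum(commenters) / len(commenters)))
print("avg satisfaction, non-comment users: %.2f" % (sum(lurkers) / len(lurkers)))
```

Run it and the comment users come out with a noticeably higher average score, even though satisfaction never depends on commenting. A naive comparison of the two groups would read that gap as a causal lift from the comment feature.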
The other problem I wanted to add to my list – the influence of navigational structure – is not unheard of in other forms of analysis but it is less common in marketing analytics and is sometimes more insidious. Like self-selection, navigational structure is a good thing when it comes to web design but a problem when it comes to measurement. In many kinds of web analytics, we are trying to determine the impact of viewing one piece of content (page or tool) on some downstream decision (registration or purchase).
If we find that one piece of content is highly correlated with the downstream behavior, we tend to think it is working well.
Unfortunately for ease of measurement, web sites are not random presentations. They are more like a magician’s card trick where the design team is trying to “force” the user to pick the ace of spades. Because of this, a high correlation between pieces of content is nearly always indicative of nothing more than navigational structure.
Thanks to the heavy influence of web site structure on behavior, a goodly percentage of web analytics research accomplishes nothing more than an abstract mapping of the navigational structure of the web site!
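Here is a companion sketch of the navigational-structure problem (again with invented page names and rates, purely for illustration): purchase intent is fixed before any page is seen, and a hypothetical “feature tour” page persuades no one, but the buy path routes almost every intending buyer through it:

```python
import random

random.seed(7)

def simulate_session():
    # Purchase intent is set before any page is seen; the "feature tour"
    # page changes no one's mind. The buy path simply routes most intending
    # buyers through it, while casual browsers rarely land there.
    intends_to_buy = random.random() < 0.10
    if intends_to_buy:
        saw_tour = random.random() < 0.95
    else:
        saw_tour = random.random() < 0.20
    purchased = intends_to_buy
    return saw_tour, purchased

sessions = [simulate_session() for _ in range(100_000)]
viewers = [p for t, p in sessions if t]
non_viewers = [p for t, p in sessions if not t]

print("conversion among tour viewers:  %.1f%%" % (100.0 * sum(viewers) / len(viewers)))
print("conversion among non-viewers:   %.1f%%" % (100.0 * sum(non_viewers) / len(non_viewers)))
```

In this simulation the tour page converts viewers at many times the rate of non-viewers, yet it has zero persuasive power: the correlation is nothing more than a map of how the site routes intending buyers.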
In a way, navigational structure and self-selection are two sides of the same coin. In measurement, we are always trying to sort out the real impact of content on behavior. But we need to be constantly aware that our analysis is not based on a controlled experiment. On the one hand, we are looking at the behavior of a population that is far from random and that has strong predispositions to certain behaviors. On the other hand, the tool they are using – the web site – was designed to discourage or limit certain behaviors while making others nearly unavoidable.
Unless you are careful (and that’s part of what good process is all about), you can find yourself in a “Heads you lose, tails I win” situation when it comes to web measurement.
So now that I’ve added two more problems to my list, I’m going to tackle what I think are appropriate process responses in my next post(s). It's a daunting task.
So naturally, I'm also thinking ahead. I was thinking of doing a short series on the Omniture Reporting API after the process posts. API stands for application programming interface – and in this case, it’s a toolkit for accessing the SiteCatalyst reports from other software tools. There are a lot of cases where using an API is preferable to using a canned application like SiteCatalyst or even the Excel integration. I thought I’d talk a little bit about when to consider using an API and provide some guidance on using Omniture’s version.
Re self-selection:
Just heard of an internal client study (no links or sources were provided) that showed people who read product reviews were already more likely to buy than those who did not read reviews - hence some or most of the "lift" in conversion attributed to product reviews may be a result of the pre-disposition to buy among people who read reviews.
Agreed on the VOC comment; much of this area suffers from "who cried loudest" bias rather than any scientific sampling of populations. Suggest people set a threshold, e.g. "unless at least 10 customers say it's a problem, it's not a problem *with us*". Not to say you don't solve the customer's problem, but at the same time don't go looking to change policy / procedure to address complaints below the threshold.
Posted by: Jim Novo | January 22, 2009 at 09:44 AM