It's hard to get analysis right. Even when you do, it's hard to get it consistently right. Good process is very much about protecting ourselves from the things that cause mistakes so that we have a chance to be consistently correct. In two previous posts, I’ve listed eight different common sources of error in web analytics:
1. Self-Interested Measurement: finding what you expect to find in the data.
2. Lack of Statistical Significance: believing small variations or tiny samples carry far more significance than they really do.
3. Unreliable Data and What to Do About It: trending bad data or “getting over” critical data quality issues.
4. Siloed Optimization: improving one channel at the expense of others.
5. Metric Monomania: over-reacting to changes in individual KPIs.
6. Tactical Focus: concentrating on micro-analysis and missing the important information.
7. Self-Selection: reversing cause-and-effect by assuming that correlated items are necessarily causal.
8. Navigation Structural Influences: evaluating content performance without factoring in the influence of site structure.
So now it’s finally time to tackle some possible organizational responses to each of these. There is no single solution that encompasses them all. In general, I’ll suggest a specific organizational structure, process, or analytic tactic designed to combat each error factor. Add them up, and I hope they’ll form a significant part of an overall picture of a good web analytics process. Remember, I’m not suggesting that all process is driven by error theory – so I won’t trouble myself to elaborate a complete process design. But I am suggesting that ANY web analytics process design should include these components or risk significant and continued misuse of web measurement.
Self-Interested Measurement
This is such a large problem in all measurement that the cure has many different aspects. From an organizational perspective, I think the single most important point is that your web measurement should always be quasi-independent from your tactical implementation teams. This goal is complicated by the fact that measurement needs to be close to those teams or it won’t understand what to measure and won’t be able to communicate results effectively. So the really tricky goal is integration without co-option. I think this is best accomplished by having measurement report up at a fairly senior level, outside the daily operational responsibilities. In other words, if measurement is being directed and consumed at the VP Marketing level, it shouldn’t report there but at least one level up. I’d also have measurement report up to two different organizations – perhaps to C-level strategy as well as within Marketing. In this way, measurement is never entirely co-opted to the interests of one department. No matter what structure you adopt, you’ll never entirely avoid the problem of self-interested measurement. On the other hand, if you build in too many checks-and-balances, you’ll never get anything done. It’s a question of balance, obviously. But for many organizations that simply grow measurement underneath its most important consumers, it’s a question of serious imbalance.
There are also some essential analytic processes that an organization can commit to that will help prevent too much self-interested measurement. One of the simplest and most important is to demand that every project define its measurement criteria for success before it is deployed. This is a powerful tool for organizational discipline. And it is critical that it be done before deployment. As I’ve said often, it’s always possible to find some evidence of success when you are allowed to choose the measure of success after the fact. By committing to the criteria for success beforehand, the possibilities for egregious self-interested measurement are greatly reduced.
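To make that concrete, here’s a minimal sketch of what a pre-registered success definition might look like. Everything in it (the class, the field names, the numbers) is my own illustration rather than any standard; the one essential property is that it’s written down and dated before launch.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical sketch of a pre-registered success definition. The fields
# and values are illustrative; the point is that all of this exists,
# dated and agreed on, BEFORE the project goes live.
@dataclass(frozen=True)
class SuccessCriteria:
    project: str                  # what is being deployed
    primary_metric: str           # the single metric that defines success
    baseline: float               # current value of that metric
    target: float                 # pre-committed threshold for "success"
    evaluation_window_days: int   # how long after launch to measure
    registered_on: date           # proves the criteria predate deployment

criteria = SuccessCriteria(
    project="Redesigned checkout flow",
    primary_metric="checkout conversion rate",
    baseline=0.021,
    target=0.025,
    evaluation_window_days=30,
    registered_on=date(2009, 1, 15),
)
```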
A third, extremely common manifestation of self-interested measurement is the surprising practice of allowing vendors to measure their own performance. Whether it is SEO, PPC or web design, the idea that your vendor should be setting the success goals or providing the measurement for their own performance is dreadfully wrong. It isn’t a question of honesty. We are all simply too keen to read the data as we wish to see it. Independent measurement is essential, just as independent software testing is. It’s not because developers are out to lie about bugs in their software – they just can’t recognize them. Believe me, this is true for all of us.
Lack of Statistical Significance
In talking about this before, I mentioned that most information users end up considering professional statisticians a royal pain in the ass. But if statisticians sometimes act more like snotty gatekeepers than useful information processors, that’s because gatekeeping is a genuine and necessary role: someone has to keep bad analysis at bay. Web Analytics departments need not be stocked to the gills with professional statisticians. One is probably enough. But if you have a team generating reporting and analysis on a regular basis, you need at least one gatekeeper reviewing it and quashing the most abusive practices.
In addition, it’s good policy to build measures of significance and variation directly into your reporting. It’s not enough for your analytics team’s conclusions to be free of gross statistical error if your reports make it all too easy for information consumers to commit those same breaches. One of the benefits of Analytic Reporting – the direction I’ve been preaching for the last year – is that you can build measures of variation and significance right into the reports. That’s a huge win, since the people most likely to make basic mistakes in statistical analysis (our report consumers) are the ones who have traditionally gotten the least help and training.
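As an illustration of the kind of measure I mean, here’s a minimal sketch in Python (the function and the sample numbers are mine, not from any particular analytics tool) that computes a p-value for the difference between two conversion rates: the sort of column a report can carry alongside the rates themselves.

```python
import math

def conversion_significance(conv_a: int, visits_a: int,
                            conv_b: int, visits_b: int) -> float:
    """Two-proportion z-test: p-value for the difference between two
    conversion rates. A column like this lets non-statistician report
    readers see at a glance whether a lift is likely just noise."""
    p_a = conv_a / visits_a
    p_b = conv_b / visits_b
    pooled = (conv_a + conv_b) / (visits_a + visits_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the normal CDF (erf-based, no SciPy needed).
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# A 4.0% vs 3.6% "lift" on a few hundred visits each: p is about 0.74,
# i.e. almost certainly noise.
print(conversion_significance(20, 500, 18, 500))
```

Even a crude flag like “probably noise” next to a small lift will stop a lot of over-reaction before it starts.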
Unreliable Data and What to Do About It
Web Analytics data is notoriously unreliable. Unfortunately, this has spawned two diametrically opposed and equally pernicious views. The first and worst is to “get over it” and just use the data. The second is to ignore the data altogether. Neither attitude is reasonable. In a couple of decades doing data analysis, I’ve never worked with a data set that seemed really clean. Data is almost always dirty in some respects. So if we weren’t willing to work with dirty data, we’d never get any analysis done at all. On the other hand, “getting over it” is no solution. And neither is the disastrous advice to “trend the data.”
Combating data quality issues is a never-ending process, and it takes real work. First, an organization should make sure that the core infrastructure for data collection is reliable and complete. This means careful governance of tagging implementation, good standards for variable encodings, and regular checks of data quality and collection. If you aren’t doing this, you aren’t doing what you should to protect your analysts. Second, good data quality often benefits from multiple, diverse collection points. Combining panel data, VOC data, and analytics data can often highlight places where each method is struggling. Evaluating log data against tag data can often pinpoint significant issues with each. Deploying Google Analytics alongside Omniture can help too.
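The simplest version of such a cross-check can even be automated. Here’s a minimal sketch (the counts and the 15% threshold are illustrative assumptions; calibrate the threshold to the gap your two tools normally show) that flags days where two collection methods diverge suspiciously:

```python
# Hypothetical sketch: cross-check daily visit counts from two collection
# methods (say, a tag-based tool and log files) and flag days where they
# diverge enough to suggest a collection problem. The data and threshold
# are illustrative; every pair of tools has its own "normal" gap.
tag_visits = {"2009-01-12": 10400, "2009-01-13": 9800, "2009-01-14": 4100}
log_visits = {"2009-01-12": 11200, "2009-01-13": 10500, "2009-01-14": 10900}

THRESHOLD = 0.15  # relative divergence that triggers investigation

for day in sorted(tag_visits.keys() & log_visits.keys()):
    tag, log = tag_visits[day], log_visits[day]
    divergence = abs(tag - log) / max(tag, log)
    if divergence > THRESHOLD:
        # e.g. a tag dropped off a template, or a log filter changed
        print(f"{day}: tag={tag} log={log} diverge {divergence:.0%}, check collection")
```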
These combined data sources can help guide the analyst – making it easier to understand which types of analysis each tool can support. It’s quite likely that you simply cannot do a reliable new-visitor analysis with a web analytics tool. And no amount of trending is going to help. But you can do a functional analysis just fine. And perhaps there are other ways to handle the new-visitor analysis.
Just as with the law, ignorance of your data is no excuse for missing data quality problems. The only way to protect against data quality issues is to give the analyst enough understanding of how the data is collected and what changes take place that they can make informed decisions about it. The necessity for this should influence not just a host of decisions around tag deployment but also your thinking around change control and reporting.
A great deal of bad or wasted analysis is generated in organizations because no one bothered to tell the analyst that something had changed. Change control is a critical element of good measurement – and your analytics department needs to be tuned into both the technical and marketing sides of the house. From an organizational perspective, this usually means regular meetings and a well-defined flow of information about new pages, new tools, changes in infrastructure, and new campaigns from the originating departments to the measurement team.
I can see this is going to get quite long. Well, that’s no surprise. I’ll pick up next week with siloed optimization and go on from there. Halftime at the Super Bowl should be more than long enough to knock out a post!