I'm teasing, of course. I don't expect to raise anybody's salary with a discussion of quantitative variables (or anything else). But I do hope to explain one of the more important differences between web analytics and most of the traditional BI marketing analytics that has been done in the last twenty years. In addition, I'm going to make the case for why a new tool from WebTrends is a lot more important than you may be inclined to think. Along the way, you just might get a little richer, in understanding if not in take-home pay!
What actually got me to write about quantitative variables is a product from WebTrends that I first saw in late July and revisited at the Engage Conference in Vegas. The product is called Score, and it represents a product direction that I believe to be significant, one that echoes back to a great deal of work we did in the first six or seven years at Semphonic.
To explain why I think Score is significant, I’m going to have to delve into a bit of analytic theory. Don’t groan – I’ll do my best not to be too digressive!
For many years, multidimensional analysis has been the staple of business intelligence systems. Products from companies like Business Objects, Cognos and MicroStrategy have provided rich multidimensional reporting and analysis for probably a good decade. Web analytics tools like Discover 2.0, Visual Site and WebTrends VI are just beginning to bring similar levels of capability to web analytics.
What is multidimensional analysis? It’s simple stuff really, and this is ground I've gone over before. In basic statistics, you typically start an analysis with a frequency table. A frequency table gives you the counts for all the values of a single variable. Here’s an example:
Gender  Count  Percent 
Male  1549  37% 
Female  2238  53% 
Unk  413  10% 
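For readers who think better in code, here is a minimal Python sketch of building a frequency table. The function name `frequency_table` and the list `genders` are made up for illustration; the records are simply rebuilt from the counts in the table above.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, percent) rows for one categorical variable."""
    counts = Counter(values)
    total = sum(counts.values())
    # most_common() lists the largest category first
    return [(value, n, round(100 * n / total)) for value, n in counts.most_common()]

# Hypothetical visitor records rebuilt from the counts in the table above.
genders = ["Male"] * 1549 + ["Female"] * 2238 + ["Unk"] * 413

for value, n, pct in frequency_table(genders):
    print(f"{value}  {n}  {pct}%")
```

Note that the rows come out sorted by count, so Female is listed before Male here.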
A frequency table is a one-dimensional analysis – it looks at a single variable. The next step up in complexity is called a crosstabulation, which typically begins with two variables. Here’s a classic 2-way crosstabulation:

Age / Gender  Male  Female  Unk
16-25  540  400  60
26-40  525  900  75
41-65  325  780  195
65+  159  158  83
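A crosstabulation is just a count per pair of values. The sketch below is one hedged way to show that in Python: `crosstab` and the record layout are assumptions for illustration, and the age bands are written with hyphens (16-25, 26-40, 41-65, 65+). Summing the cells back down one dimension recovers the gender frequency table from earlier.

```python
from collections import Counter

def crosstab(records):
    """Count records per (age_band, gender) cell."""
    return Counter((r["age"], r["gender"]) for r in records)

# Cell counts from the crosstabulation above, keyed by (age band, gender).
table = {
    ("16-25", "Male"): 540, ("16-25", "Female"): 400, ("16-25", "Unk"): 60,
    ("26-40", "Male"): 525, ("26-40", "Female"): 900, ("26-40", "Unk"): 75,
    ("41-65", "Male"): 325, ("41-65", "Female"): 780, ("41-65", "Unk"): 195,
    ("65+", "Male"): 159, ("65+", "Female"): 158, ("65+", "Unk"): 83,
}

# Expand the counts into hypothetical visitor records, then re-tabulate them.
records = [{"age": a, "gender": g} for (a, g), n in table.items() for _ in range(n)]
cells = crosstab(records)

# Collapsing the age dimension gives back the one-dimensional gender counts.
gender_totals = Counter()
for (age, gender), n in cells.items():
    gender_totals[gender] += n

print(dict(gender_totals))
```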
Crosstabulation is two-dimensional analysis – and the basic method can be extended into three, four, five and potentially even more dimensions. In three-way analysis, we might add a variable like income and be able to see the count of all High-Income, Age 16-25 Males versus the count of all Low-Income, Gender Unknown, Age 65+ visitors.
What’s happening when you do multidimensional analysis is, implicitly, visitor segmentation. Each cell in the n-dimensional table can be reasonably considered a specific visitor segment. And by adding metrics around success or usage, you can map these descriptive variables to real-world differences in performance.
N-dimensional analysis is powerful. But it also has some fundamental limitations that are poorly appreciated, both in the BI world, where it has been the dominant paradigm, and in the web analytics world, where it has looked like the holy grail.
When you use multidimensional analysis, you are segmenting visitors (or visits) into finer and finer units. Eventually, you might have a success count for an extremely small population defined by six or seven different factors. But as powerful as this is, there are some things it just can't do.
First, multidimensional analysis is like a series of implicit AND filters. A visitor must be 18-35 AND male AND High-Income AND located in California. But suppose you want to add an OR filter. Suppose, for example, that you want 18-35 AND male AND (High-Income OR Medium-Income) AND located in California. You can do this (in a way) by adding up cell counts. But the multidimensionality works against you now, because the OR may add 500 (50 states x 5 age categories x 2 gender categories) cells to keep track of. That isn’t very practical. So here’s a key capability to look at when you evaluate a multidimensional reporting system: can you easily collapse some values in a dimension? Some systems let you do this, but many don’t. It’s a subtle point, but it makes a big difference in the real world.
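To make "collapsing values in a dimension" concrete, here is a small Python sketch. Everything in it is hypothetical (the `collapse` helper, the cell counts, the label "High or Medium"); it only illustrates the idea of merging two income values so that the OR becomes a single cell again.

```python
from collections import Counter

def collapse(cells, axis, mapping):
    """Merge values along one dimension; values not in mapping are kept as-is."""
    merged = Counter()
    for key, n in cells.items():
        new_key = list(key)
        new_key[axis] = mapping.get(key[axis], key[axis])
        merged[tuple(new_key)] += n
    return merged

# Hypothetical cell counts keyed by (age, gender, income, state).
cells = {
    ("18-35", "Male", "High", "CA"): 120,
    ("18-35", "Male", "Medium", "CA"): 310,
    ("18-35", "Male", "Low", "CA"): 95,
    ("18-35", "Female", "High", "CA"): 140,
}

# Treat High and Medium income as one value of the income dimension (axis 2).
merged = collapse(cells, axis=2, mapping={"High": "High or Medium", "Medium": "High or Medium"})
print(merged[("18-35", "Male", "High or Medium", "CA")])  # 120 + 310 = 430
```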
Even more significant, however, is the difficulty that ANY multidimensional analysis system has with quantitative (continuous) variables. Classic multidimensional analysis evolved in the CPG world, where demographic variables were almost always the dimensions. Demographics aren’t usually quantitative. You are either Male or Female, 18-35 or 60+. It's true that variables like age and income COULD be treated as quantitative values, but they are almost always used as category variables. They aren't treated as numeric variables where the difference in value is significant. From a marketer's perspective, 25-40 is just a category, and 26 is the same as 40 but different than 24.
This is a key fact about multidimensional analysis: it isn't particularly useful for anything except the analysis of variables as CATEGORY variables. Multidimensional analysis doesn't treat values as numerically significant.
In web analytics, however, the key dimensions are behavioral. And virtually every behavioral variable IS quantitative. We are commonly interested in HOW MUCH a visitor did: how many page views of Product Material, how many petitions they signed, how long they spent on site, how often they visited, how much product they purchased. In all of these cases, good analysis of the data requires numerical comparison of the variable values.
There are behavioral variables that aren’t quantitative (Is Customer, Is Registered), but they are much less common than quantitative variables. In web analytics, by far the most important behavior is the page view. And page views are quantitative in every sense. They are numerically comparable and the number is ALWAYS significant.
Multidimensional analysis doesn’t handle these quantitative variables. You HAVE to bucket variables before you can analyze them, so you're reducing quantitative data to category data at the very outset. And if the variable isn’t bucketed, it isn’t available as a dimension. So suppose you want to analyze the effect of viewing Product X feature pages or spending time in the Product X feature area. In multidimensional analysis, you can’t do it unless you can bucket Product X feature page views or Product X feature time.
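Here is a rough Python sketch of what bucketing does to a quantitative variable. The bucket edges and labels are invented for illustration and are not taken from any particular tool.

```python
from bisect import bisect_right

def bucket(value, edges, labels):
    """Map a number to the label of the range it falls in.

    edges must be ascending, and labels needs one more entry than edges:
    values below edges[0] get labels[0], values >= edges[-1] get labels[-1].
    """
    return labels[bisect_right(edges, value)]

# Hypothetical buckets for "Product X feature page views".
edges = [1, 4, 10]
labels = ["0", "1-3", "4-9", "10+"]

for views in (0, 3, 4, 9, 10):
    print(views, "->", bucket(views, edges, labels))
```

Once bucketed, 4 views and 9 views are the same value and 3 views is a different one, which is the age example all over again: the actual number is gone.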
In many, many systems, you cannot bucket these variables. That means you can’t do multidimensional analysis on them. And even when you can bucket them, you are limited to the ranges in the buckets. You aren't ever analyzing them numerically. And here’s where our AND issue strikes again, because if I want to evaluate visitors with High Pages OR High Content Time, I’m back to adding up cells. And the methods of multidimensional analysis don't give me any way to take advantage of the fact that the actual value of a quantitative variable IS significant and can be usefully compared to other actual values.
Without the ability to flexibly bucket and collapse range variables, doing multidimensional analysis on quantitative variables is impossible. With it, the analysis is extremely clumsy and not very useful. No matter how powerful the multidimensional model is, it’s really the wrong tool for the job.
So what’s the right tool? Well, that brings us back to my opening. Because the answer is something like Score.
Score lets you assign values to actions – and those values are additive. So I can produce a visitor score based on the number of product content views. Or the total time in a content area. Or on ANY combination of those two values that exceeds a threshold I set. What is virtually impossible to do with even the most flexible and powerful multidimensional analysis tool is trivial with a scoring system.
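The idea of additive scoring can be sketched in a few lines of Python. To be clear, this is not how WebTrends Score is configured or implemented; the action names, weights and threshold below are all assumptions chosen for illustration.

```python
def visitor_score(actions, weights):
    """Additive score: sum of weight * quantity over a visitor's actions."""
    return sum(weights.get(name, 0) * qty for name, qty in actions.items())

# Hypothetical points per product page view and per minute in the product area.
weights = {"product_page_views": 2.0, "product_area_minutes": 0.5}
THRESHOLD = 15  # hypothetical cutoff for an "interested" visitor

visitors = {
    "mixed": {"product_page_views": 6, "product_area_minutes": 14},  # 12 + 7 = 19
    "pages_only": {"product_page_views": 8},                         # 16
    "light": {"product_page_views": 2, "product_area_minutes": 10},  # 4 + 5 = 9
}

scores = {name: visitor_score(actions, weights) for name, actions in visitors.items()}
interested = sorted(name for name, s in scores.items() if s >= THRESHOLD)
print(scores, interested)
```

The point of the design is that the threshold can be crossed by pages alone, by time alone, or by any mix of the two, without enumerating cells.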
Before we began using off-the-shelf software, scoring was the primary analytic technique we used at Semphonic. We didn’t do it by hand (the way Score has you do it) – we used neural networks to score visitors across dozens of different dimensions. But Score’s user-driven approach is still vastly more powerful than multidimensional analysis for a host of important web analytics tasks.
What tasks are these? One of the obvious ones is measuring Engagement – which is why Score was discussed so prominently at Engage. You can’t measure Engagement with a single measure. And nearly every significant component of Engagement is quantitative: how much you do of something is CRITICAL. That makes analysis of Engagement in multidimensional tools problematic.
But the utility of a product like Score is hardly limited to Engagement. For obvious reasons, scoring methodologies allow for significantly richer visitor segmentation than rules based around dimensional filtering.
CRM integration is a third, and particularly significant, application for scoring methods. When I first saw Score, its potential uses for driving customer contact programs seemed obvious. Many of our clients do both regular and event-driven email messaging based on site behavior. Using multidimensional filtering, these cuts are quite limited. Scoring makes this process both much simpler and much more powerful.
A travel site, for instance, could easily establish threshold values for receiving an email alert on a particular destination. That threshold might include ANY combination of actual trips to the destination, planned trips to the destination and actual trips to similar destinations. Trying to accomplish a similar filter with multidimensional cells is either impossible or tortuous.
Here’s an even more difficult problem – suppose I send my customers a once-monthly newsletter and I want to target a dynamic offer to the destination they are MOST interested in. This is flatly impossible with multidimensional filtering. It is possible with scoring systems (though not, sadly, with Score, which doesn’t currently support this).
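Both travel-site cases can be sketched with a generic per-destination score in Python. This is an illustration of scoring in general, not of Score itself; the action names, point values, destinations and threshold are all invented.

```python
from collections import defaultdict

# Hypothetical points per action type.
WEIGHTS = {"trip_booked": 10, "trip_planned": 4, "similar_trip_booked": 3}

def destination_scores(events):
    """Add up points per destination from (destination, action, count) events."""
    scores = defaultdict(int)
    for destination, action, count in events:
        scores[destination] += WEIGHTS.get(action, 0) * count
    return dict(scores)

def alert_destinations(scores, threshold):
    """Destinations whose score reaches the email-alert threshold."""
    return sorted(d for d, s in scores.items() if s >= threshold)

def top_destination(scores):
    """The single destination the customer appears MOST interested in."""
    return max(scores, key=scores.get) if scores else None

# One customer's hypothetical history.
events = [
    ("Maui", "trip_planned", 2),
    ("Maui", "similar_trip_booked", 1),
    ("Paris", "trip_booked", 1),
    ("Oslo", "trip_planned", 1),
]

scores = destination_scores(events)
print(alert_destinations(scores, threshold=10), top_destination(scores))
```

Picking the newsletter offer is just a comparison between scores, which is exactly the step a cell-based filter has no way to express.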
Effectively, what scoring methods let the analyst accomplish is the analysis and combination of quantitative variables. In web analytics, this turns out to be an immensely useful capability because all of the core variables ARE quantitative.
Score is a ways from being perfect; it's Version 1.0, after all. You can’t use all the variables you should be able to. You can’t compare scores. You can’t use negative scores. There are limitations on the number of scores you can build. The rule-building process isn’t terribly flexible when it comes to integrating with large amounts of content. There is no data-driven scoring.
But despite these V1.0 weaknesses, Score is a very, very significant upgrade in capability compared to multidimensional analysis within web analytics. It's already a great tool for a number of key web analytics tasks. With continuous improvement, it has the potential to become one of the most important tools in web analytics. Once analysts start using tools like Score, they are going to realize something that should have been fairly obvious all along. Web Analytics is all about quantitative variables.
"From a marketers perspective, 25-40 is just a category. and 26 is the same as 40 but different than 24."
Then Mr Marketeer is doing something wrong.
Posted by: Ralph Sparkle  October 22, 2007 at 01:37 AM
Gary,
I needed to reread several paragraphs to fully get to grips with the methodology, and working out how I could apply it to personal examples/implementations took some time. In the end, though, the post was a real eye opener, especially for a rookie like myself.
In regards to negative scores, would these all be predefined or based on pages visited prior?
For instance, suppose visitor XX first visits a high-cost product page, then moves on to a product with a lower price tag. Compared with the visitor's initial interest, the ROI would end up being lower if the visitor purchased the latter (cheaper) product. Then visitor YY visits the same site, but proceeds directly to the product page of the less expensive product and purchases it. Even though they purchased the same product, would visitor XX get points deducted when visiting the product page of the cheaper product, based on his clickpath, when compared to visitor YY?
Thanks for the info!
Matthew
Posted by: Matthew_Niederberger  October 22, 2007 at 08:32 AM
Gary,
This is a brilliant analysis of the new Score feature by WebTrends. I was thrilled when I saw the announcement, too. (It's funny, though, that I learned far more from your post than from WebTrends  and I am a current WebTrends customer... but that's another story)
Like Matthew, I had to reread a few of your paragraphs as well to understand your points, but that's probably the nature of the complexity of your topic.
Thanks!
Posted by: Toronto SEO  October 22, 2007 at 07:22 PM