Not too long ago there was an interesting thread on the Yahoo Web Analytics Forum that began with the question: “Why does testing work?” As any child knows (or soon learns), with “why” questions, good answers are surprisingly hard to come by. What seems obvious often turns out to be complicated and obscure.
I wanted to start this series on the emergence of Digital Database Marketing with a similar question – “Why does Web analytics work?”
This particular “why” question has two implicit claims: first, that we know what Web analytics IS, and second, that it does, in fact, work. I’m going to say that we do know what Web analytics is – it’s the study of actual behavior on a Web site with the purpose of first understanding and then optimizing the use of that Web site and the broader online channel. There are, of course, many different methods for doing this type of study, and some methods inevitably work better than others while some don’t work at all. However, I think that all Web analytics techniques are based on a single, simple pair of methods and that these methods, properly applied and endlessly extended, do work.
When we do Web analytics, the essential behavior we see is a trail of where a visitor went in the virtual space of a single site. You can think of a Web site as a series of places – each connected by pathways that vary in size.
Let’s imagine a Web site that’s set up as a maze. Suppose every passageway were essentially unmarked – with every link labeled “Go to another Room” and the contents of each page being nothing more than a room number and a set of boxes containing the “Go to” links. It’s doubtful that a visitor would actually traverse such a site, but let’s imagine that we recruited a group of students for a study and asked them to do so. Since the content and the links are essentially undifferentiated, we’d expect the participants to choose without motivation. However, the arrangement of the boxes might impose a certain order on the maze.
Based on the presence of the links in boxes near the top of the site and the size and position of the boxes, we’d find that some paths are much more traveled than others – and we could probably infer the combination of presence, size and position of the boxes that created the most traffic to any given room.
This method would give us a very primitive type of Web analytics: one in which all we are doing is understanding how the navigational structure of a Web site impacts random travelers.
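To make that concrete, here’s a minimal sketch (in Python, with invented room names and path data – none of it from a real site) of what this primitive analysis amounts to: tally how often each room is reached and how often each link is followed, and the “natural structure” of the maze falls out of the counts.

```python
from collections import Counter

# Invented clickstream data: each path is the ordered list of rooms one
# recruited visitor walked through in the unlabeled maze site.
paths = [
    ["home", "room_2", "room_5", "room_9"],
    ["home", "room_2", "room_7"],
    ["home", "room_3", "room_5", "room_9"],
]

room_visits = Counter()   # how often each room is reached
link_traffic = Counter()  # how often each (from_room, to_room) link is followed

for path in paths:
    room_visits.update(path)
    link_traffic.update(zip(path, path[1:]))

# The most-traveled rooms and links describe the "natural structure" of the maze.
print(room_visits.most_common())
print(link_traffic.most_common())
```

Nothing here knows anything about why anyone went anywhere; it simply maps where the structure tends to carry traffic.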
Of course, Web sites aren’t set up this way. They aren’t undifferentiated mazes – they are places created to provide specific types of content to visitors. Because of that, links are intended to help steer visitors to the “right” places, and the content itself is designed to achieve some function (explain, convince, sell, entertain, etc.). What’s more, we don’t pay visitors to come to our sites as part of a research project – they have to want to be there, and they have to want to go to the places on the site that we suggest.
We assume, therefore, that when visitors navigate to a place, they did so with “intention.” Suppose that instead of five “Go to another Room” boxes, I present the visitor with five boxes, each of which is labeled “Find out more about Product Type X,” where X is one of five products: “Universal Tea Makers”, “Red Hockey Pucks”, “Fancy Hot Chocolate Mixes”, “Football Binoculars”, and “Cinnamon Banana Snacks.” If a visitor clicks on the “Red Hockey Pucks” link on my home page, I would normally infer that of the five products I sell, that’s the one they are most interested in.
I don’t want to glide over this step, because it’s at the heart of nearly every Web analytics technique. We assume that a visitor’s choices on a Web site express something about their intentions. We understand those intentions by adding some meta-understanding about the content they viewed.
Typically, we do this in a very simple, shorthand fashion. For example, what we are really doing in this “Red Hockey Pucks” case is matching up link names to intent. If the five boxes on the home page were just labeled “Go to another Room”, then regardless of the fact that the resulting page content is different, we couldn’t infer intent. Likewise, if we switched links so that the link that went to “Red Hockey Pucks” was labeled “Cinnamon Banana Snacks”, we wouldn’t infer that the visitor wanted “Red Hockey Pucks” just because they went to that content.
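A small sketch of that shorthand (again in Python, with hypothetical link labels and a made-up mapping): the inference is nothing more than a lookup from the label the visitor clicked to the product interest we assign to it – which is exactly why mislabeled links break the inference.

```python
from collections import Counter

# Hypothetical mapping from home-page link labels to inferred product interest.
# The inference only holds because the labels truthfully describe the content.
LINK_TO_INTENT = {
    "Find out more about Universal Tea Makers": "Universal Tea Makers",
    "Find out more about Red Hockey Pucks": "Red Hockey Pucks",
    "Find out more about Fancy Hot Chocolate Mixes": "Fancy Hot Chocolate Mixes",
    "Find out more about Football Binoculars": "Football Binoculars",
    "Find out more about Cinnamon Banana Snacks": "Cinnamon Banana Snacks",
}

def inferred_interests(clicked_labels):
    """Turn one visitor's clicked link labels into a count of inferred interests."""
    return Counter(
        LINK_TO_INTENT[label] for label in clicked_labels if label in LINK_TO_INTENT
    )

print(inferred_interests(["Find out more about Red Hockey Pucks"]))
# Counter({'Red Hockey Pucks': 1})
```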
It’s also true that the availability, size, position, and design of the links and boxes may influence visitors. Along with intent, we have to consider the first method of Web analytics that mapped the natural pathways of the Web site. If the “Red Hockey Pucks” box is huge and colorful and the other four are small, below-the-fold, and dull, then some visitors who might have picked “Universal Tea Makers” or “Football Binoculars” are going to pick “Red Hockey Pucks” instead. So if we know that the “natural path” to “Red Hockey Pucks” is larger, we might expect that some percentage of the visitors there might have picked a different product if the structure were different.
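Here’s a rough illustration of that adjustment, with invented click counts and prominence weights (these numbers are assumptions for the sake of the example, not measurements): dividing raw clicks by a prominence score gives a crude, structure-adjusted view of relative interest.

```python
# Hypothetical prominence weights for the five home-page boxes: size, position,
# and design rolled into one number (larger = more likely to attract a click
# regardless of intent). Illustrative values only.
prominence = {
    "Red Hockey Pucks": 3.0,      # huge, colorful, above the fold
    "Universal Tea Makers": 1.0,
    "Fancy Hot Chocolate Mixes": 1.0,
    "Football Binoculars": 0.5,   # plain text, below the fold
    "Cinnamon Banana Snacks": 1.0,
}

# Invented raw click counts from the home page.
raw_clicks = {
    "Red Hockey Pucks": 600,
    "Universal Tea Makers": 150,
    "Fancy Hot Chocolate Mixes": 120,
    "Football Binoculars": 80,
    "Cinnamon Banana Snacks": 50,
}

# Divide clicks by prominence to get a rough, structure-adjusted interest score.
adjusted = {p: raw_clicks[p] / prominence[p] for p in raw_clicks}
total = sum(adjusted.values())
for product, score in sorted(adjusted.items(), key=lambda kv: -kv[1]):
    print(f"{product}: {score / total:.0%} of structure-adjusted interest")
```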
And, of course, if we had a sixth product (“Large Concrete Beach-Balls”) that isn’t displayed on the Home Page at all, then we have no simple way of assessing Visitor interest in Concrete Beach Balls relative to Red Hockey Pucks.
All that being said, the “assumption of intentionality” combined with an understanding of the “natural structure” of the Web site are the two fundamental principles of the Web analytics method. Here are some examples of common Web analytics techniques that further illustrate the point (a short code sketch of the last one follows the table):
Technique | Behavior | Inference
--- | --- | ---
Link Analysis | A high percentage of visitors who click on the Universal Tea Makers link immediately back out of the target content or search for something in the link name. | The link name and the content are mismatched. The click on the link name is inferred to show an interest, but the content isn’t consumed. It may be that people don’t know what a Universal Tea Maker is!
Real-Estate Analysis | The Red Hockey Pucks link is favorably positioned, large, and well designed but draws fewer clicks than the Football Binoculars link, which is plain text and below the fold. | Visitors are much more interested in the Football Binoculars. A better match of site real estate to visitor intent would make the Web site navigation more natural.
Funnel Remarketing | A visitor added Fancy Hot Chocolates to the cart and then abandoned the shopping process. | The visitor is interested in hot chocolate but wasn’t willing or ready to buy. Sending them a discount offer might convert them.
Navigational Gaps | Visitors search for Concrete Beach-Balls on the Home Page. | The lack of Home Page navigation is forcing visitors to use internal search to find what they want (Concrete Beach-Balls). If the behavior is common, the product should be added to the Home Page.
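The last row of the table, for instance, is easy to sketch as an analysis. Assuming we have the internal search terms entered from the Home Page and the list of Home Page product links (both invented here), a navigational-gap check is just a comparison of the two:

```python
from collections import Counter

# Products that actually have a link on the Home Page (hypothetical list).
home_page_links = {
    "Universal Tea Makers", "Red Hockey Pucks", "Fancy Hot Chocolate Mixes",
    "Football Binoculars", "Cinnamon Banana Snacks",
}

# Invented internal search terms entered from the Home Page.
home_page_searches = [
    "concrete beach balls", "red hockey pucks", "concrete beach balls",
    "concrete beach balls", "tea maker",
]

def navigational_gaps(searches, links, threshold=2):
    """Common Home Page search terms with no matching Home Page link."""
    counts = Counter(term.lower() for term in searches)
    linked = {name.lower() for name in links}
    return {
        term: n
        for term, n in counts.items()
        if n >= threshold and not any(term in l or l in term for l in linked)
    }

print(navigational_gaps(home_page_searches, home_page_links))
# {'concrete beach balls': 3}
```

Both principles are at work even in this tiny example: the search expresses intent, and the Home Page structure explains why that intent had nowhere natural to go.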
I could go on and on, because as far as I am aware, EVERY single Web analytics technique depends on some combination of the assumption of intentionality and an understanding of the “natural structure” of the Web site. This is what we actually do when we are doing “Web analytics.”
I have no doubt that both methods are, in their essence, valid. But it’s also true that the power of these methods is dependent on how skillful the analyst is in using them, and the methods themselves often rely on subjective evaluations.
For example, I made the obvious inference that clicking on the “Red Hockey Pucks” link means I’m interested in “Red Hockey Pucks.” Few analysts would be inclined to disagree, though it might not be correct. Suppose I extend this method by saying that a visitor who views “Red Hockey Pucks” and “Football Binoculars” but not “Fancy Hot Chocolates” or “Universal Tea Makers” is interested in sports, not food.
Perhaps. There are, however, an endless number of equally plausible alternative categorizations that might also fit. If these are the two cheapest products, what the visitor may be interested in is products under $10. That would be a completely different interpretation, but it might just as easily be true. Or perhaps the visitor likes red and these two products had red pictures. In retail, color is often a vital dimension of visitor intent.
The world doesn’t provide the analyst with appropriate or meaningful categorizations of the data. They need to be constructed and tested – and one of the great arts of analysis is finding the most interesting meta-data categorizations that can be applied.
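A small sketch of the problem, using invented product attributes: the same two-product viewing history fits a “sports” categorization, an “under $10” categorization, and a “likes red” categorization equally well, and nothing in the clickstream alone tells you which meta-data scheme to prefer.

```python
# Hypothetical product metadata; attribute values are invented for illustration.
products = {
    "Red Hockey Pucks":          {"category": "sports", "price": 8.00,  "color": "red"},
    "Football Binoculars":       {"category": "sports", "price": 9.50,  "color": "red"},
    "Universal Tea Makers":      {"category": "food",   "price": 24.00, "color": "silver"},
    "Fancy Hot Chocolate Mixes": {"category": "food",   "price": 12.00, "color": "brown"},
}

# One visitor's viewing history.
viewed = ["Red Hockey Pucks", "Football Binoculars"]

# Three competing meta-data categorizations of the same behavior.
schemes = {
    "interested in sports":    lambda p: p["category"] == "sports",
    "interested in under $10": lambda p: p["price"] < 10,
    "interested in red items": lambda p: p["color"] == "red",
}

# Every scheme fits this visitor equally well; the data alone can't pick one.
for label, rule in schemes.items():
    fits = all(rule(products[name]) for name in viewed)
    print(f"{label}: {'fits' if fits else 'does not fit'}")
```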
In addition, I’ve limited myself to talking about the simplest and most basic metrics available to the analyst – the pages viewed (or links clicked). Other metrics, like time-on-page, internal search term, product category and price, external search term, and user actions – the whole panoply of metrics and measurements – all provide additional refinements on our ability to infer intention or understand natural structure. Which of these metrics we use and how we interpret them is, once again, a matter of the analyst’s art.
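As one illustrative (and entirely made-up) refinement, dwell time can be layered on top of the raw page view so that a four-second glance counts for less inferred interest than a long read – just one of many possible ways to weight the same behavior.

```python
import math

# Invented per-page dwell times (seconds) for one visitor's product page views.
page_views = {
    "Red Hockey Pucks": 95,        # read the whole page
    "Football Binoculars": 4,      # bounced almost immediately
    "Fancy Hot Chocolate Mixes": 20,
}

# A crude refinement: weight each viewed product by the log of its dwell time,
# so a quick glance contributes far less inferred interest than a long read.
interest = {product: math.log1p(seconds) for product, seconds in page_views.items()}

for product, score in sorted(interest.items(), key=lambda kv: -kv[1]):
    print(f"{product}: {score:.2f}")
```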
I’m hoping that most of what I’ve written here is going to be taken as obviously true. Perhaps so obviously true that it hardly bears repeating. However, I think that while we all use these two basic methods, we don’t necessarily think much about what we are doing. By making them explicit, I hope to add clarity to the ensuing discussion. In my next post, I’m going to take a similar look at traditional database marketing and show how, at this very basic level, the core techniques are nearly identical.
How important do you think it is to be able to track individual visitors as they navigate through a site (as opposed to totals, e.g., 100 people arrived at this page, 15 clicked on X, 50 clicked on Y, etc.)?
Posted by: Greg | January 19, 2011 at 05:20 PM
Greg,
One pretty fundamental divide between traditional database marketing technique and traditional Web analytics is that the former is more focused on Visitor-level data and the latter on aggregates. It's interesting, however, that whether you are fixing site problems detected at the aggregate level or targeting individuals, you're still working within the same basic principles (I think).
For many classical analysis purposes, individual level tracking isn't necessary. However, that isn't universally true. There are plenty of site analysis techniques that only work when you have individual data. And, of course, for any kind of digital database marketing, it's pretty much essential to go down to the individual level.
So I guess my answer is that it's very important indeed! And as I continue this series, I'll be focusing more on how to take data at the individual level and make it interesting.
Posted by: Gary Angel | January 19, 2011 at 09:38 PM