Be sure to check out the interview with Charles Schwab’s Dennis Bradley – run on Manoj’s blog (http://www.webanalyticsworld.net/2008/07/x-change-interview-dennis-bradley-of.html) since I'm pretty sure Manoj gets wider circulation than I do! I’ve worked with Dennis for more than a few years now – and it’s always been a pleasure. His practical, common-sense approach to measurement is refreshingly realistic and always productive. If you're managing a serious enterprise measurement effort, Dennis is someone you really want to talk to. Check out his X Change interview and I think you’ll get a great sense of how enjoyable and useful his Huddles will be!
Segmentation Overview
For many years, marketing professionals have relied on a set of analysis techniques designed to help them understand the demographic and psychographic profiles of their customers and prospects. These traditional segmentations are usually derived from complex clustering techniques that map rich primary research data (usually survey based) into common groups or profiles. These groups are then given highly descriptive business names and rich descriptions and provide a framework for a wide range of marketing activities. Though such segmentations can be (and are) applied to online customers, companies that have tried to map these segmentations down to the individual level (for targeting or reporting) in the online world have mostly been disappointed. In Part I of this series, I described the biggest pitfall in extending these segmentations – the near impossibility of mapping demographic and psychographic profiles to visitors about whom we typically know nothing except their online behavior. In Part II, I discussed the advantages and disadvantages of building behavioral segmentations. In Part III, I covered different strategies for joining survey data to behavioral segmentations, when each is appropriate, and why the join is necessary at all. In Part IV, I covered basic data transformations for segmentation – focusing on describing visitor-level topic interest. In this post, I’ll cover using Functionalism and Session-Analysis to try and capture how a visitor uses a web site.
The most common web behavioral transformations capture the amount of usage a visitor has on a web property (total page views, visits, time, etc.). Transformations based on the content taxonomy capture another critical dimension – a visitor’s topic interests. But there is a third dimension of a visitor’s site usage that is poorly captured by either of these; namely, the type of session(s) a visitor has.
A visitor’s type of session encompasses a whole array of interesting questions and facts: was a visitor browsing or looking for specific information, was a visitor confident about where to go or confused, was the visitor concerned about the company/brand or confident, was the visitor in early shopping stages or later-stages, and many more equally important facts.
These may seem impossible to infer from purely behavioral data, but, with the proper data transformations, many of these types of behaviors will emerge quite clearly in a visitor segmentation. The key is having a rich non-topic site taxonomy that describes the role different pages play on the site. For this purpose, we use Functionalism.
Functionalism is a powerful, general-purpose public-domain methodology for understanding how to measure the pieces of a web site. In concept, it’s quite simple. When you build a web page, you have a purpose in mind for it: getting visitors to the right place on the web site, convincing a user to buy, getting a user through a conversion process, re-assuring a user about your company, providing generic information about a product or service so a visitor doesn’t leave your site to research, saying thank-you for a lead or purchase, etc., etc.
These roles are captured in Functionalism as a set of functional types – Routers, Convincers, Converters, Re-Assurers, Informers, Completers, etc. And for each functional type, we have a corresponding set of appropriate measurements designed to capture how well that page is performing. Simple.
But the underlying power of the concept is considerable, and it turns out to be extremely useful in visitor segmentation. Here are two sample visitor sessions mapped at the page level:
Welcome, Products, Strategy & Research, Select List, Advice, Banking, Mortgage
Welcome, Products, Mutual Funds, Solutions, Company Funds, Balanced
These two sessions contain about the same number of pages – but they are dramatically different types of sessions. How do we know – and more importantly – how can we capture the difference in a way that a formal analysis will understand?
Here are the same sessions using Functional Categories for each page:
Engager, Router, Router, Convincer, Router, Router, Convincer
Engager, Router, Convincer, Convincer, Convincer, Convincer
In the first session, the user has navigated across the site and only dipped into each area. This is a classic “unfocused browsing” visit. In the second session, the user has moved immediately into an area of interest and spent the entire time in the sales pages for that area (“directed interest”).
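As a rough sketch of the translation step, the page-to-function mapping can be treated as a simple lookup applied to each page in a session. The lookup table below is copied from the two sample sessions above; the names PAGE_FUNCTION and to_functional are my own illustration, not part of Functionalism itself, and a real site would need a key more specific than the page title.

```python
# Hypothetical lookup: page name -> functional type.
# The pairs are taken from the two sample sessions in the text.
PAGE_FUNCTION = {
    "Welcome": "Engager",
    "Products": "Router",
    "Strategy & Research": "Router",
    "Select List": "Convincer",
    "Advice": "Router",
    "Banking": "Router",
    "Mortgage": "Convincer",
    "Mutual Funds": "Convincer",
    "Solutions": "Convincer",
    "Company Funds": "Convincer",
    "Balanced": "Convincer",
}


def to_functional(session):
    """Translate a list of page names into a list of functional types."""
    # Pages missing from the lookup are flagged rather than guessed at.
    return [PAGE_FUNCTION.get(page, "Unclassified") for page in session]


session_a = ["Welcome", "Products", "Strategy & Research", "Select List",
             "Advice", "Banking", "Mortgage"]
session_b = ["Welcome", "Products", "Mutual Funds", "Solutions",
             "Company Funds", "Balanced"]

functional_a = to_functional(session_a)
functional_b = to_functional(session_b)
```

Once sessions are expressed this way, the difference between them becomes something a formal analysis can count and compare.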
Here’s another pair of sample sessions:
Welcome, Research, Mutual Funds, Stock Funds, Index Funds, Screener, Fundamental
Welcome, Products, Mutual Funds, Key Fund, Learn More, Large Cap, Affiliate
Both these sessions are topically concerned with Mutual Funds. But an analysis of the Functional Page Types reveals them to be fundamentally different:
Engager, Router, Informer, Informer, Informer, Tool, Informer
Engager, Router, Convincer, Convincer, Closer, Convincer, Closer
In the first session, the pages are about Mutual Funds. In the second session, the pages are about specific products offered. The difference might not be well captured in a taxonomy (site content categorization) but is completely encompassed in a functional description.
It’s possible, of course, that one could build a site taxonomy that captured both these variables in a single place. But in general, it’s cleaner not to. By separating out function and topic-interest, you get taxonomies for each that are clean and powerful both individually and in combination.
The obvious data transformations based on Functional Type include simple counts by page type, total time by type, and avg. time by type. But treating these variables as visitor-level roll-ups can miss one of the most interesting facts about a visitor – the extent to which their sessions are homogeneous or diverse.
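The simple roll-ups (count, total time, and average time by functional type) might look something like the sketch below. It assumes each page view arrives as a (functional type, seconds) pair; that input shape and the function name rollup_by_type are assumptions for illustration, and the timings in the example are invented.

```python
from collections import defaultdict


def rollup_by_type(page_views):
    """Roll page views up to count, total time and average time per type.

    page_views: iterable of (functional_type, seconds) pairs (assumed shape).
    """
    counts = defaultdict(int)
    total_time = defaultdict(float)
    for ftype, seconds in page_views:
        counts[ftype] += 1
        total_time[ftype] += seconds
    return {
        ftype: {
            "count": counts[ftype],
            "total_time": total_time[ftype],
            "avg_time": total_time[ftype] / counts[ftype],
        }
        for ftype in counts
    }


# Invented timings, purely to show the shape of the result.
example_views = [("Router", 10), ("Router", 20), ("Convincer", 60)]
example_rollup = rollup_by_type(example_views)
```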
To capture this, you need to profile each session and then aggregate session types at the visitor level. Profiling sessions can be done with an independent session-based segmentation (useful in its own right) or by simple rule-based classification. Either way, what you’ll aggregate at the visitor level is a count by session style.
Counts by session style are interesting because they capture two different things about a visitor – their most current state and the mix of sessions they exhibit. For shopping and lead gen sites, this mix of sessions can help re-segment and classify visitors as they move through a sales cycle. For media sites, the mix of different types of sessions a visitor has helps define how broad a visitor’s engagement with the web property is. This information is also useful for customer data warehousing. Saving the most current session style as well as the session style counts for a visitor is a remarkably efficient way to capture visitor life-cycle and potential messaging opportunities in the customer data record.
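Here is a minimal sketch of the rule-based route: classify each session by its dominant functional types, then store the style counts and the most current style for the visitor. The 50% threshold, the groupings of types, and the "information seeking" and "mixed" labels are assumptions made for illustration; only "unfocused browsing" and "directed interest" come from the examples earlier in this post. Sessions are assumed to be in chronological order.

```python
from collections import Counter

# Assumed groupings of functional types; adjust to the site.
SALES_TYPES = {"Convincer", "Closer"}
INFO_TYPES = {"Informer", "Tool"}


def classify_session(ftypes, threshold=0.5):
    """Assign a session style from the share of pages in each group."""
    if not ftypes:
        return "empty"
    n = len(ftypes)
    sales = sum(t in SALES_TYPES for t in ftypes) / n
    info = sum(t in INFO_TYPES for t in ftypes) / n
    routing = sum(t == "Router" for t in ftypes) / n
    if sales >= threshold:
        return "directed interest"
    if info >= threshold:
        return "information seeking"  # hypothetical label
    if routing >= threshold:
        return "unfocused browsing"
    return "mixed"  # hypothetical label


def visitor_profile(sessions):
    """Aggregate session styles for one visitor (sessions oldest first)."""
    styles = [classify_session(s) for s in sessions]
    return {
        "style_counts": dict(Counter(styles)),
        "current_style": styles[-1] if styles else None,
    }


browsing = ["Engager", "Router", "Router", "Convincer", "Router", "Router", "Convincer"]
researching = ["Engager", "Router", "Informer", "Informer", "Informer", "Tool", "Informer"]
buying = ["Engager", "Router", "Convincer", "Convincer", "Closer", "Convincer", "Closer"]

profile = visitor_profile([browsing, researching, buying])
```

The two fields in the result correspond to what gets saved on the customer data record: the mix of session styles and the most recent one.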
The mix of pages is not the only interesting data transformation that relies on a functional taxonomy (though I’ve only touched on some of the interesting mixes – depending on the site, the mix of re-assurers / closers, engagers / routers, completers / converters / convincers, closers / convincers, informers / routers, and more might all be significant). Another interesting variable is the amount of time spent on a router page for a section relative to the amount of time in articles or convincer pages.
Comparing the amount of time that different visitors spend on a single router page is equally interesting – because it can be a measure of how directed the visitor is. Short router-page times generally signal that a visitor knows what they are looking for. Longer times generally mark off visitors who are less focused or less experienced with the site.
So time spent by page classification and the ratios of time spent by page classification both turn out to be fairly powerful segmentation variables.
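One way to compute such a ratio, sketched under the assumption that page views are available as (functional type, seconds) pairs: divide router time by time on content pages. Treating Convincer and Informer as the "content" group is my assumption, and the function name router_time_ratio is hypothetical.

```python
def router_time_ratio(page_views, content_types=("Convincer", "Informer")):
    """Ratio of time on Router pages to time on content pages.

    page_views: iterable of (functional_type, seconds) pairs (assumed shape).
    Returns None when there is no content time, so callers can decide
    how to treat routing-only visits instead of dividing by zero.
    """
    page_views = list(page_views)
    router = sum(sec for ftype, sec in page_views if ftype == "Router")
    content = sum(sec for ftype, sec in page_views if ftype in content_types)
    if content == 0:
        return None
    return router / content


# Invented timings, purely to show the calculation: 30 / (90 + 30).
example_ratio = router_time_ratio(
    [("Router", 30), ("Convincer", 90), ("Informer", 30)]
)
```

A low ratio suggests a visitor who passes quickly through routing pages on the way to content; a high one suggests time spent deciding where to go.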
In the Functional paradigm, Internal Search is considered to be a special type of router – and there is another data transformation that can help sharpen a visitor segmentation for internal search users. We like to sub-classify Internal Searchers by grouping searches according to broad keyword categories. Typically, we’ll also try to identify popular searches that are “broad” and classify all other searches as “directed.” For the segmentation, we’ll classify the Search Results page as either Broad Routing or Directed Routing based on this keyword classification (which also allows us to record an interest taxonomy view for the search). If we don’t have the ability to do this, we’ll generally classify search as a Directed Router.
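A minimal sketch of that search rule follows. The list of broad searches is hypothetical (in practice it would come from reviewing popular keywords), and the function name classify_search is my own. When no keyword is available, the sketch falls back to directed routing, as described above.

```python
# Hypothetical list of popular "broad" searches, stored in lower case.
BROAD_SEARCHES = {"funds", "retirement", "accounts"}


def classify_search(keyword, broad_searches=BROAD_SEARCHES):
    """Classify a Search Results page view as Broad or Directed Routing."""
    if keyword is None:
        # Keyword not captured: default to directed, per the rule above.
        return "Directed Routing"
    # Normalize case and whitespace before the lookup.
    normalized = " ".join(keyword.lower().split())
    if normalized in broad_searches:
        return "Broad Routing"
    return "Directed Routing"


broad_example = classify_search("  Retirement ")
directed_example = classify_search("large cap index fund fees")
missing_example = classify_search(None)
```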
Most media sites include some type of role-based taxonomy (Home page, Tool, article) – even if it isn’t as rich as the full Functional specification. For other types of sites, this is usually missing. It’s obviously a convenience to have this information flow through in the tagging, but it’s hardly what we expect. In most cases, the analyst will have to overlay this data on the raw feed as one of the data transformations. It’s a fair amount of work for larger sites, but the results are worth it. It is in the cross-matrix of visitor interests and session-styles that most of the truly interesting profile facts about visitors actually emerge.
Adding these session-style profiles to a visitor segmentation is where behavioral segmentations really begin to diverge from traditional psychographic ones. The nature of the information becomes fundamentally different (in a way that isn't true for topic interest) and the richness and actionability of the data is greatly enhanced. These types of variables help make behavioral segmentation a powerful tool for subsequent web analysis of every sort since they capture the fundamental engagement styles users bring to the mechanics of web navigation.
In my next post, I’m going to cover more data transformations – focusing on ways to capture time-based elements in a visitor aggregation.
