

An excellent post on a difficult issue -- the bridging of behavioral and traditional segmentation. I did have a couple of comments and thought I would pass them along -- never been shy on that account :-)

I am still a novice in the web analytics area (learning a lot from the gurus in the field), so my apologies in advance for any misinterpretation of your statements or erroneous conclusions. I will admit, however, that I have done my share of traditional segmentation and other analytics.

My first question is why anyone would want to predict traditional segments using online behavioral data, or vice versa. For one, (in my opinion) in addition to the points you mention in your post, segmentation is also driven very much by what you are trying to accomplish with the segments you end up with (using cluster analysis, neural networks, or any other method). The other reason is that I have always perceived online behavioral data as an extension of the data sources used in traditional segmentation (much like data from any other channel). To this end, I have always strived to improve the traditional segmentation to the point that it can also explain, and be used for, online marketing. Again, my perception could be misplaced.

When it comes to segmentation using online information, I too have tried to approach it from various directions. The first one entails pooling all the data about a visitor from all sources (online, demographic/psychographic, firmographic, transactional, etc.) and then picking relevant variables (either manually or using the variable selection feature in data mining tools) based on the objective of the segmentation. Once you have your sample with all key variables, you can use either cluster analysis or one of the many other methodologies to create segments. And lastly, as you mention, generate the profile and an appropriate description/label for each segment.
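
The first approach can be sketched roughly as follows. This is only an illustration, not the tooling I actually used: the visitor records, field names, and the bare-bones k-means below are all made up for the example, and a real run would standardize the variables and use a proper data mining package.

```python
from math import dist
from statistics import mean

def kmeans(rows, k, iterations=20):
    """Plain k-means on equal-length numeric tuples, seeded with the first k rows."""
    centroids = [tuple(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign each visitor to the nearest centroid.
        labels = [min(range(k), key=lambda c: dist(row, centroids[c])) for row in rows]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels, centroids

# Hypothetical pooled records: online, transactional, and firmographic fields.
visitors = [
    {"id": "v1", "visits": 2,  "orders": 0, "employees": 5},
    {"id": "v2", "visits": 3,  "orders": 1, "employees": 8},
    {"id": "v3", "visits": 20, "orders": 9, "employees": 400},
    {"id": "v4", "visits": 18, "orders": 7, "employees": 350},
]

# Variables picked for this segmentation objective (an assumption for the example).
selected = ["visits", "orders"]
rows = [tuple(v[name] for name in selected) for v in visitors]

labels, centroids = kmeans(rows, k=2)

# The centroids double as a first-cut profile of each segment.
for segment, center in enumerate(centroids):
    print(segment, dict(zip(selected, center)))
```

The per-segment means are what you would then read through to come up with a description or label for each segment.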

The second approach I have tried is to go through the behavioral data using the same approach as traditional segmentation (using similar methodologies, such as cluster analysis). Once this was done and the pure behavior-based profiles were completed, I used the traditional segments and the demographic/firmographic aspects to sub-profile the segments. This was done less to see the overlap between the traditional and behavioral segments and more from a contact strategy perspective.
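
The sub-profiling step in this second approach amounts to a cross-tabulation. A minimal sketch, assuming each visitor already carries a behavioral segment and a traditional segment (the visitor IDs and segment names here are hypothetical):

```python
from collections import Counter

# Hypothetical visitor-level assignments from the two segmentations.
behavioral = {"v1": "browser", "v2": "browser", "v3": "buyer", "v4": "buyer", "v5": "buyer"}
traditional = {"v1": "SOHO", "v2": "Enterprise", "v3": "SOHO", "v4": "SOHO", "v5": "Enterprise"}

def sub_profile(behavioral, traditional):
    """Share of each behavioral segment that falls into each traditional segment."""
    matched = [(seg, traditional[v]) for v, seg in behavioral.items() if v in traditional]
    pair_counts = Counter(matched)
    segment_totals = Counter(seg for seg, _ in matched)
    return {pair: n / segment_totals[pair[0]] for pair, n in pair_counts.items()}

shares = sub_profile(behavioral, traditional)
for (behavior_seg, traditional_seg), share in sorted(shares.items()):
    print(f"{behavior_seg:8s} {traditional_seg:11s} {share:.0%}")
```

Reading across a row tells you how a behavioral segment breaks down by traditional segment, which is the view I found useful for contact strategy.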

The catch to the above approaches, as you point out, is the ability to link behavioral information at the visitor level with other information. I do believe this is not easy but definitely possible. It will depend on what kind of customer infrastructure you have in place. If you are lucky, your company might have an individual-level link between your online visitors and your traditional data. So, for example, you will know that John, who came to your site and did XYZ, also interacted with these other channels, used so-and-so services/products, belongs to this industry, is a SOHO, belongs to a company with $$ sales, and, if he was part of a survey, responded in a certain manner, etc.
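
To make the individual-level link concrete, here is a small sketch of the join. It assumes the online data and the customer data share a visitor ID; the field names and values are invented for the example.

```python
# Hypothetical online behavior rows keyed by a shared visitor ID.
online = [
    {"visitor_id": "c-100", "pages_viewed": 12, "used_tool": True},
    {"visitor_id": "c-200", "pages_viewed": 3, "used_tool": False},
    {"visitor_id": "anon-1", "pages_viewed": 7, "used_tool": False},
]

# Hypothetical traditional data (firmographics) keyed by the same ID.
customers = {
    "c-100": {"industry": "Retail", "size": "SOHO"},
    "c-200": {"industry": "Finance", "size": "Enterprise"},
}

def link(online, customers):
    """Split online rows into those that join to a customer record and those that do not."""
    linked, unlinked = [], []
    for row in online:
        profile = customers.get(row["visitor_id"])
        if profile is None:
            unlinked.append(row)
        else:
            linked.append({**row, **profile})
    return linked, unlinked

linked, unlinked = link(online, customers)
print(len(linked), "linked;", len(unlinked), "unlinked")
```

The unlinked pile is the part worth watching: how large it is depends entirely on the customer infrastructure behind the site.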

Anyway, just some random thoughts on the segmentation topic. Please feel free to shoot holes in my approach or thinking.

“To know yet to think that one does not know is best;
Not to know yet to think that one knows will lead to difficulty.”
- Lao Tzu

Hi Gary,

Very interesting post. I look forward to reading the sequel posts.

I realise the great value we would derive from combining “traditional” segmentation models with online behaviour. It is the next step in marrying the ‘what’ with the ‘why’. Those companies that master this integration of research sources stand to gain a significant advantage over their rivals.

I agree with you that applying a segmentation scheme to all visitors would be extremely challenging. I think the most we can realistically aspire to (given current technologies) is a representative sample by joining the survey data to the behavioural data.

However, I assume that in most cases the “joint” between the survey ID and the web analytics data is a cookie. With time this joint will suffer degradation as visitors delete their cookies.

So you are left with two main options:
1. Continuously run the survey (which might annoy site visitors unless traffic levels are extremely high so you can keep launch rates relatively low)
2. Use your sample data periodically for a relatively short period of time (until the next time you run the survey)
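
One way to decide how long a sample stays usable is to track how much of it still joins. A rough sketch, with made-up cookie IDs and week numbers; note that a respondent who simply has not returned to the site looks the same here as one who deleted the cookie.

```python
def match_rate(survey_cookies, seen_cookies):
    """Share of survey respondents whose cookie ID still appears in the analytics data."""
    if not survey_cookies:
        return 0.0
    return len(survey_cookies & seen_cookies) / len(survey_cookies)

# Hypothetical cookie IDs captured when the survey was completed.
survey_cookies = {"a", "b", "c", "d"}

# Hypothetical cookie IDs observed in the web analytics data, by week since the survey.
seen_by_week = {
    1: {"a", "b", "c", "d", "x"},
    4: {"a", "b", "c", "y"},
    8: {"a", "z"},
}

rates = {week: match_rate(survey_cookies, seen) for week, seen in seen_by_week.items()}
for week, rate in rates.items():
    print(f"week {week}: {rate:.0%} of respondents still matched")
```

When the rate falls below whatever you consider representative, it is time to refresh the panel.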

Nielsen Online offer a product, Market Intelligence, that joins web analytics with online survey data. Unfortunately, the web analytics capabilities are pretty basic. Nonetheless, MI enables you to combine the different survey variables to create segments on the fly. You can then superimpose the selected segment on the top level web analytics behavioural data.

I’ve used MI to collect mainly demographic and some psychographic data on the biggest media websites in Israel. It was fascinating. We got over 10,000 survey completes within days of launch. However, within two months the panel had to be refreshed. The refresh wasn’t hard to do technically but did raise issues of data consistency with some people.

Instadia, which was bought by Omniture in early 2007, was another web analytics tool with integrated survey capabilities. I believe it was Omniture’s intention to introduce this product (under the name ClientStep) as a SiteCatalyst plug-in some time ago (I last spoke to them about it around November 2007), but I cannot remember any announcements being made.

Is the cookie deletion problem something you’ve encountered as well? Would love to hear more about your experience with this matter.

Michael Feiner


Great point. Cookie deletion is a big issue - not just with the join - but with segmentation in general. The loss of long-term tracking data significantly limits the reach of behavioral segmentation. This is hardly a problem unique to visitor segmentation but it certainly does have an impact. A behavioral segmentation solves the problem of applying the segmentation to all visitors (but doesn't resolve the cookie issue). But behavioral segmentations have their own issues - which is what I'll be talking about next.

Thanks for the thoughts!



That's an interesting point about whether it makes sense to try and predict traditional segments from online data. I think it does make sense - it's just not always possible. Why would you want to do this? From my perspective, I'd like to be able to incorporate the company segmentation into my online reporting and analysis. When a mass-media campaign is running (for example), I'd love to be able to say that it drove Segment X more than Segment Y and Segment Z not at all (or some equivalent story). This is really interesting stuff - and applicable across a wide range of reporting needs. At some level, I'd also love to be able to incorporate these segments into analytics. So if I'm studying the impact of a site tool, I could say it worked well for Segment X but not Segment Y. That's potentially both interesting and useful. However, I typically can't do any of that unless - from behaviors - I can infer what traditional segments visitors belong to; I have to do this because I don't have the survey data to apply to tool users or campaign sourced visitors. Make sense?
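
As a toy illustration of inferring segments from behaviors: learn from the surveyed visitors, whose segment is known, then apply that to visitors who never took the survey. The nearest-centroid rule and all of the numbers below are stand-ins chosen for brevity, not the method we actually use.

```python
from math import dist
from statistics import mean

def fit_centroids(samples):
    """samples: list of (behavior_vector, segment). Returns segment -> mean behavior vector."""
    by_segment = {}
    for vector, segment in samples:
        by_segment.setdefault(segment, []).append(vector)
    return {
        segment: tuple(mean(col) for col in zip(*vectors))
        for segment, vectors in by_segment.items()
    }

def predict(centroids, vector):
    """Assign the segment whose average behavior is closest to this visitor's behavior."""
    return min(centroids, key=lambda segment: dist(vector, centroids[segment]))

# Hypothetical surveyed visitors: (visits, tool uses) plus their known traditional segment.
surveyed = [
    ((1, 0), "Segment X"),
    ((2, 1), "Segment X"),
    ((9, 6), "Segment Y"),
    ((11, 8), "Segment Y"),
]
centroids = fit_centroids(surveyed)

# Hypothetical campaign-sourced visitors with behavior but no survey response.
campaign_visitors = [(2, 0), (10, 7), (1, 1)]
inferred = [predict(centroids, v) for v in campaign_visitors]
print(inferred)
```

Whether this works in practice depends on the traditional segments actually behaving differently online, which is exactly why it's not always possible.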

It's really interesting that you've approached this problem from both directions just like we have. What's been your experience about which direction worked better? I'd love to compare notes sometime. I think this is one of the most interesting and challenging tasks in web analytics. And I believe that lots of companies could significantly improve their online segmentations.

Thanks for the thoughts!


I think we are on the same page but just coming at it from different semantics. In terms of incorporating the company segments into online reporting or observing which segment(s) benefited from an online campaign -- I am completely in agreement with you. I just never thought of it as 'prediction'. Maybe because I had the luxury of not having to work with cookies but actually had a unique visitor id for most customers which I could use to create the link between the online data and traditional/transactional data and segments.

