It’s easy to see how marketers use demographic data. Demographic data points are the stuff of everyday life – the way we think about people and categorize them in our heads. If you’re striking up a conversation with someone, you’re automatically noting how old and what sex they are (age and gender) and adapting your conversation appropriately. If you don’t know them, you’ll probably ask questions like “What do you do?” (occupation) and “Where did/do you go to school?” (education). You’ll note much more, of course: how they look, how they dress, how they smile, how they talk – the whole realm of social cues that we all learn as children and that go far beyond, but still include, those core demographics. It is the great strength and weakness of demographic data for marketing that it is foundational; demographics are easy to understand and universally applicable, but oh so broad.
Web behavioral data doesn’t work this way. Viewing a page on a Website, doing a search on Google, even buying a product: none of these generates an immediate, intuitive view of the person.
When we first started building behavioral segmentations with Web analytics, we used the straightforward variables you’d expect: number of visits, number of purchases, number of page views, time on site. What we got – invariably – was a “pyramid” segmentation. At the bottom was a huge base of one-and-done site bouncers. From there, each segment was a small step up in total behavior until you reached the top of the pyramid and a small group of visitors with many visits and many page views. It’s a spectacularly useless segmentation because all it captures is total usage of the Web site.
If you think about it, that’s hardly surprising. Every variable I listed above is heavily correlated with the others. They are all usage variables or deeply related to usage. In the DM Radio Predictive Analytics webinar I did last week, I mentioned a (true) story about a site where an analyst had correlated page views to orders and found that viewing 4+ pages was highly correlated to conversion (signing up for a free trial). When we looked at the site, we found that the conversion process took at least 4 pages, and that a majority of visitors who signed up for the free trial did so immediately upon landing.
In other words, that analyst had essentially proven that there was a strong correlation between filling out the form necessary to get a free trial and having a free trial.
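The circularity is easy to reproduce with a few made-up visits. The sketch below assumes a hypothetical four-page sign-up flow (`FUNNEL_LENGTH`) and invented visit data; neither comes from the site in the story, and the helper names are illustrative only.

```python
# Hypothetical: the sign-up flow itself spans four pages.
FUNNEL_LENGTH = 4

# Invented visits as (pages_viewed, converted) pairs, for illustration only.
visits = [
    (1, False), (2, False), (1, False),  # left before reaching the form flow
    (4, True), (4, True), (5, True),     # converted, mostly on form pages alone
    (6, False), (9, True),
]


def conversion_rate(rows):
    """Share of visits in rows that converted (0.0 for an empty list)."""
    return sum(1 for _, converted in rows if converted) / len(rows) if rows else 0.0


short_visits = [v for v in visits if v[0] < FUNNEL_LENGTH]
long_visits = [v for v in visits if v[0] >= FUNNEL_LENGTH]

# No visit shorter than the funnel can convert, so the "4+ pages" effect
# is built into the data before any analysis starts.
print(conversion_rate(short_visits))
print(conversion_rate(long_visits))


def pages_outside_funnel(pages_viewed, converted):
    """A less circular usage measure: pages viewed outside the form flow."""
    return pages_viewed - FUNNEL_LENGTH if converted else pages_viewed
```

Subtracting the funnel pages, as `pages_outside_funnel` does, at least removes the part of the page count that conversion itself creates.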
Anytime I get a request for an analysis of this sort (“How many pages does it take to get a conversion?”), I know I have to try to redirect the question, because how many pages it takes to get a conversion depends on three far more important variables: who the visitor is, what they were trying to accomplish when they entered the Web site, and what the pages viewed actually were.
For a visitor familiar with the brand and ready to order, the number of page views it takes to convert ought to be about equal to the number of pages it takes to place an order, and the types of pages viewed ought to be primarily form pages (Functional Converters).
For a visitor trying to figure out what our product does, the number of page views it takes to convert is necessarily going to be much larger, and the types of pages viewed ought to move from Informers and Explainers to Convincers and, finally, to Converters.
The right question to be asking is something more like this: of the visitors coming to our Web site, how many are in each of the following (sample) categories:
- Determined to Convert: Will buy unless the Web site fails
- Ready to Convert: Want to buy but could be put off by a poor operational experience
- Shopping: Ready to buy but still comparing/shopping around
- Persuadable: Potentially ready to buy but might not make any purchase at all
- Researching: Investigating a product but not ready to make any purchase
- Browsing: Finding other information that is related to a product but not in any form of shopping mode
For most of these visit categories, there will be multiple types of visitors. Visitors who come to a site to Shop may be new or existing customers and they may be primarily driven by Price, Availability, or Convenience.
The combination of Visit Intent (what they came for) and Visitor Type (who they are) is the two-tiered Segmentation (Visitor and Visit Type) that I’m going to be talking a great deal about in this whole series.
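As a rough sketch of what the two tiers look like as data, the example below pairs a Visitor Type with a Visit Intent for each visit and counts the resulting segments. The intent names come from the sample list above. The `VisitorType` fields, the hand-assigned labels, and the visit data are assumptions made up for illustration; how a visit actually gets classified is not shown here.

```python
from collections import Counter
from enum import Enum
from typing import NamedTuple


class VisitIntent(Enum):
    """The sample visit categories listed above."""
    DETERMINED_TO_CONVERT = "Determined to Convert"
    READY_TO_CONVERT = "Ready to Convert"
    SHOPPING = "Shopping"
    PERSUADABLE = "Persuadable"
    RESEARCHING = "Researching"
    BROWSING = "Browsing"


class VisitorType(NamedTuple):
    """Hypothetical visitor tier, modeled on the Shopping example."""
    status: str  # "new" or "existing" customer
    driver: str  # "price", "availability", or "convenience"


# Invented, hand-labeled visits: one (visitor type, visit intent) pair each.
visits = [
    (VisitorType("new", "price"), VisitIntent.SHOPPING),
    (VisitorType("new", "price"), VisitIntent.SHOPPING),
    (VisitorType("existing", "convenience"), VisitIntent.READY_TO_CONVERT),
    (VisitorType("existing", "convenience"), VisitIntent.SHOPPING),
]

# Each distinct pair is one cell of the two-tiered segmentation.
segment_sizes = Counter(visits)

for (visitor, intent), count in segment_sizes.items():
    print(visitor.status, visitor.driver, intent.value, count)
```

Note that the same visitor type can show up under different intents on different visits, which is why the two tiers are kept separate rather than collapsed into one label.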
For each combination of visit and visitor types, there will be a sweet spot of maximum effectiveness in terms of the amount of content and the type of content that works best. This third piece – the types of pages viewed – is the place where meta-data becomes critically important.
In Web analytics, the only “fact” we always know about a page view is the page name. Analysts use implicit meta-data from the page name all the time. We know the difference between these two pages:
/products/aisle/product/pricing
/information/aisle/how-to-use-productX-for-Ying
For real analysis, however, you need to translate that implicit meta-data understanding into substantive analytic categories. There are as many different ways of building meta-data around pages as there are analysts – and it is in the construction of meta-data that the art of analysis happens. There is no fixed, right approach to creating meta-data about a page.
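One simple way to start making that implicit reading explicit is to derive a few fields from the page name itself. The sketch below uses the two example paths above; the function name, the output fields, and the classification rules are hypothetical and would differ for every site.

```python
def implicit_metadata(path):
    """Derive rough meta-data from a page path (illustrative rules only)."""
    segments = [s for s in path.strip("/").split("/") if s]
    section = segments[0] if segments else ""
    leaf = segments[-1] if segments else ""

    # Hypothetical rules; a real site needs its own mapping.
    if section == "products" and leaf == "pricing":
        page_kind = "pricing"
    elif leaf.startswith("how-to"):
        page_kind = "how-to"
    else:
        page_kind = "unclassified"

    return {
        "section": section,
        "depth": len(segments),
        "leaf": leaf,
        "page_kind": page_kind,
    }


for page in (
    "/products/aisle/product/pricing",
    "/information/aisle/how-to-use-productX-for-Ying",
):
    print(implicit_metadata(page))
```

Rules like these only recover what the URL happens to encode, which is why explicit coding of pages matters for anything beyond the basics.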
There are, however, some fundamental, obvious, and nearly always useful ways of creating page meta-data:
- Functional Taxonomy: Describing what the page is supposed to be doing in the broader site-structure
- Site Taxonomy: The hierarchical levels that the page occupies (e.g. Products/Detail)
- Product Taxonomy: The product/family the page concerns (e.g. TVs/LCD/ModelX)
- Topic Taxonomy: A topic coding of the content (e.g. International Affairs/Middle East/Egypt/Revolution)
- Audience: The visitor segments the page is designed for (e.g. All, engineers, consumers, health-care providers, professionals, etc.)
- Sales-Stage: The place in the sales stage the content is directed to (e.g. Early, Middle, Late)
- Page Components: The modules the page contains (e.g. videos, images, reviews, etc.)
- Component Classification: The value or status of the page or component (e.g. Overall Review Rating is High or Low, Price is Discounted or List, Availability is Out-of-Stock)
- Content Cardinality: The amount of line-item content on a page (e.g. Number of Search Results Returned, Number of Products Listed, Number of Reviews)
- Page Length: The number of words or screens of text on the page (e.g. 800 word description, 200 word article, article in 3 pages, article in 1 page)
- Content Source: The publisher, source, author of the content (e.g. Columnist X, Database Y, blogger Z)
- Publish Date & Days since Changed: The recency and freshness of the content
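To make the list concrete, here is one possible shape for a page meta-data record that covers each of these dimensions. It is a sketch, not a prescribed schema: the class name, field names, and the sample page are all assumptions, and the example values are borrowed from the illustrations in the list.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Tuple


@dataclass
class PageMeta:
    """Hypothetical record holding the meta-data dimensions listed above."""
    page_name: str
    functional_type: str                      # Functional Taxonomy
    site_path: Tuple[str, ...] = ()           # Site Taxonomy
    product_path: Tuple[str, ...] = ()        # Product Taxonomy
    topic_path: Tuple[str, ...] = ()          # Topic Taxonomy
    audience: Tuple[str, ...] = ("All",)      # Audience
    sales_stage: Optional[str] = None         # Early, Middle, Late
    components: Tuple[str, ...] = ()          # Page Components
    component_status: dict = field(default_factory=dict)  # Component Classification
    item_count: Optional[int] = None          # Content Cardinality
    word_count: Optional[int] = None          # Page Length
    content_source: Optional[str] = None      # Content Source
    published: Optional[date] = None          # Publish Date
    last_changed: Optional[date] = None

    def days_since_changed(self, today: date) -> Optional[int]:
        """Freshness in days, or None when no change date is recorded."""
        if self.last_changed is None:
            return None
        return (today - self.last_changed).days


# Invented sample page, using example values from the list above.
detail_page = PageMeta(
    page_name="/products/tvs/lcd/modelx",
    functional_type="Convincer",
    site_path=("Products", "Detail"),
    product_path=("TVs", "LCD", "ModelX"),
    audience=("consumers",),
    sales_stage="Late",
    components=("videos", "reviews"),
    component_status={"review_rating": "High", "price": "Discounted"},
    word_count=800,
    last_changed=date(2011, 2, 1),
)

print(detail_page.days_since_changed(date(2011, 2, 21)))
```

Keeping the record keyed by page name means it can be joined to page-view data, which is the only "fact" the analytics tool reliably provides.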
The richer your meta-data coding, the better your analytic opportunities. With this kind of page meta-data, you can answer common and critically important questions about site design:
- Does having a video on a page improve performance?
- Do negative reviews alter page function – at what point?
- Did a Product Detail perform poorly because the item was out-of-stock for most views?
- What’s the optimal length of a Product Sales page?
- How often should content pages be rotated or refreshed?
- What type of content do early-stage visitors want?
- What’s the optimal number of products to show on a search results page, by category?
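Mechanically, most of these questions come down to joining page views to page meta-data and comparing an outcome across meta-data values. The sketch below does that for the video question. The page names, the view data, and the choice of "advanced to the next step" as the performance measure are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical page meta-data, keyed by page name.
page_meta = {
    "/products/tv-a": {"has_video": True},
    "/products/tv-b": {"has_video": False},
}

# Invented page views as (page_name, advanced_to_next_step) pairs.
page_views = [
    ("/products/tv-a", True), ("/products/tv-a", True),
    ("/products/tv-a", False), ("/products/tv-a", True),
    ("/products/tv-b", False), ("/products/tv-b", True),
    ("/products/tv-b", False), ("/products/tv-b", False),
]


def rate_by_attribute(views, meta, attribute):
    """Success rate of page views, grouped by one meta-data attribute."""
    totals = defaultdict(lambda: [0, 0])  # value -> [successes, views]
    for page, success in views:
        value = meta[page][attribute]
        totals[value][0] += int(success)
        totals[value][1] += 1
    return {value: s / n for value, (s, n) in totals.items()}


print(rate_by_attribute(page_views, page_meta, "has_video"))
```

A raw comparison like this does not control for who the visitors were or why they came, so in practice it would be run within a visitor and visit type rather than across the whole site.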
Even better, with this type of meta-data, you can build and analyze segments that actually mean something. You can focus in on a group of visitors who consume health-care professional content and figure out what matters to them. You can identify the group of visitors who care about International Affairs and understand how long they like their articles and which authors and sources they prefer. You can find the group of Category X buyers who are primarily concerned about price and not convenience.
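A minimal sketch of that kind of segment building: assign each visitor to the audience whose targeted content they viewed most. The page names, the visitor data, and the 50% threshold are hypothetical, and a real segmentation would weigh far more than a single attribute.

```python
from collections import Counter

# Hypothetical Audience coding for a handful of pages.
page_audience = {
    "/hcp/dosing-guide": "health-care providers",
    "/hcp/clinical-data": "health-care providers",
    "/patients/faq": "consumers",
    "/about": "All",
}

# Invented page views per visitor.
views_by_visitor = {
    "v1": ["/hcp/dosing-guide", "/hcp/clinical-data", "/about"],
    "v2": ["/patients/faq", "/about"],
    "v3": ["/about"],
}


def dominant_audience(pages, audience_by_page, min_share=0.5):
    """Audience of most targeted pages viewed, or None if there is no clear one."""
    # Pages coded "All" (or not coded) say nothing about the visitor.
    targeted = [
        audience_by_page[p]
        for p in pages
        if audience_by_page.get(p, "All") != "All"
    ]
    if not targeted:
        return None
    audience, count = Counter(targeted).most_common(1)[0]
    return audience if count / len(targeted) >= min_share else None


segments = {
    visitor: dominant_audience(pages, page_audience)
    for visitor, pages in views_by_visitor.items()
}
print(segments)
```

Once visitors carry a label like this, the other meta-data dimensions (length, source, topic) can be analyzed within each segment.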
Meta-data is at the heart of all Web analytics. The behavior we have is views of Page X. The meta-data about Page X provides the engine with which we will infer the Visitor Segment, the Visit Type, and the type of page and page characteristics that work best for any specific combination of the two. It is page meta-data that forms the essential bridge between Web analytics and targeted marketing, and the richer the page meta-data, the better and more obvious will be the opportunities for using that data – whether for analytics or targeting.
When it comes to good Web analytics, it really is ALL about the meta-data.
In my next post in this series, I’m going to digress briefly and show why this focus on meta-data is the biggest differentiator between a good Web analytics implementation and a poor one – and why it’s such a critical advantage when it comes to implementation that we at Semphonic actually do analysis. Then I’m going to take up the Two-Tiered Segmentation Scheme, show how it works, and how it forms the basis of any good online reporting system, drives critical site analysis, and provides an essential bridge from classic Web analytics to a broader view of Digital Analytic marketing.
I've been working on a new meta data taxonomy for my company's website -- these are great insights. It seems that meta data is a real weak area for many analytics implementations. But now that GA offers custom variables, there's really no excuse for the lack of broader adoption. Thanks for the post. -Carson
Posted by: Carson_smith | February 28, 2011 at 01:10 PM
Great post!
Another implementation that really helps is linking user IDs to their actions, conducting individual-level analysis, and presenting the aggregated results.
Posted by: Eric | March 06, 2011 at 03:31 PM
I think I skimmed over this post a few weeks ago, thought I agreed with the general gist of it, nothing new to learn. But I just reread properly and must admit I hadn't thought of half of those types of page meta-data. Obvious in hindsight but never occurred to me previously.
And I agree, most are unlikely to ever appear in your regular reporting, probably unused in most analysis but could be absolutely critical in answering certain business questions. Looking forward to reading the rest of this series.
Posted by: Peter O'Neill | April 01, 2011 at 06:41 AM