In Matthew Fryer's Huddle on "Getting the Data to Tell Its Secrets," the broad consensus of the group seemed to be that deep-dive analysis is more art than science, that it is largely exploratory, and that its value is necessarily hard to predict. This is all certainly true with regard to the current state of digital analytics, but none of it, in my opinion, should be true.
There are analytic techniques that are fairly standard, highly repeatable and extremely likely to add value. It isn't impossible to imagine such beasts. Response modeling in targeted marketing is a good example of a repeatable, obviously valuable analytics method. So is market basket analysis in the grocery and retail shelf world. These are true analytic techniques; they are exploratory, but they are done in a highly standardized fashion with the full expectation that they will deliver value of a specific kind.
Is there a digital/Web equivalent?
I think Attribution Analysis is starting to emerge as one potential equivalent. In my recent short Facebook product write-up of ClearSaleing, I lamented that good attribution systems were forking off from Web analytics. As I've thought about this, however, I've begun to think that this process may be inevitable. Web analytics systems (and other general purpose analysis tools) are the systems we use for broad exploration. When we find a technique that really works, it's often more convenient and more elegant to split that system off into a dedicated tool.
Are there other examples in digital measurement or techniques that might be treated similarly to our advantage? At Semphonic, we use two analytics methodologies (Functionalism and Use-Case Analysis) on a highly repeatable basis. Both of these methods work consistently and turn out useful results of an expected type. I've written about both, and while I think these represent the best attempts at analytic standardization in broad digital measurement, both are more generic and require more analyst tuning than I think ideal.
Since X Change, I've been thinking about the problem and I've found several types of analysis that Semphonic has done on a one-off basis that might, with a little bit of effort, lend themselves to true standardization. I think the development of standardized analytics products would be a huge benefit to our industry (and to Semphonic of course). So I'm going to issue a kind of challenge/offer around the methods I'm about to describe.
Give us your data, pay the cloud processing costs (which shouldn't be too much), and throw in a little extra (say 5K for shipping and handling!), and we'll do a full analysis based on one of these methods. My goal, of course, is to quickly generate enough analysis projects to test whether these methods are truly repeatable and to figure out how and whether they can best be done in a consistently valuable fashion.
Here they are:
Site Navigational Structure Analysis
Description: Every site has a specific navigational structure that's been designed to achieve a balance between business goals and visitor intent. In one of my earlier posts on Web analytics theory, I described the tension between visitor intent and navigational structure as a core problem in Web analytics. This analysis is designed to take advantage of that tension to help you understand how customer use of your Website matches or differs from your expectations. The goal of this analysis is to create a complete map of the behavioral navigational structure of your site, which can be compared to the designed structure to surface mismatches between design and consumer intent.
Method: We'll model the entry and next page behavior for every significant page on your Website. For each page, we'll identify the actual behavior-based parents and children of that page. Using this data, we'll create a full mapping of the actual "behavioral" structure of your web site. As a bonus (where possible), we'll use this child/parent data to automatically assign functional categories to the pages of your site. The analysis will be entirely mechanical and will cover the entire Web site.
Deliverable: A complete hierarchical mapping of the pages on your Web site based on actual visitor navigation with any possible functional mappings added in. This mapping can be easily compared to the actual site structure to identify significant differences between the way people use your Web site and the way you expected/designed the Web site for their use.
Requirements: A data feed from your web analytics solution. This analysis can be performed on a relatively short time-slice of data - a week would suffice for most sites.
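To make the mechanics of the parent/child derivation concrete, here's a minimal sketch in Python. The session-feed format and the `behavioral_structure` function are hypothetical illustrations of the idea, not Semphonic's actual implementation; a real analysis would also weight by traffic volume, filter insignificant pages, and resolve the page-level maps into a full hierarchy.

```python
from collections import Counter, defaultdict

def behavioral_structure(sessions):
    """Map each page to its behavioral parent (most common prior page)
    and top behavioral children (most common next pages)."""
    prior = defaultdict(Counter)      # page -> counts of the page seen just before it
    following = defaultdict(Counter)  # page -> counts of the page seen just after it
    for path in sessions:
        for a, b in zip(path, path[1:]):
            prior[b][a] += 1
            following[a][b] += 1
    structure = {}
    for page in set(prior) | set(following):
        parent = prior[page].most_common(1)[0][0] if prior[page] else None
        children = [p for p, _ in following[page].most_common(3)]
        structure[page] = {"parent": parent, "children": children}
    return structure

# Hypothetical sessions: each is a list of pages in viewing order.
sessions = [
    ["home", "products", "widget-a"],
    ["home", "products", "widget-b"],
    ["home", "support", "faq"],
]
result = behavioral_structure(sessions)
```

In this toy feed, "products" acquires "home" as its behavioral parent and the two widget pages as children; comparing such a derived tree to the site's designed hierarchy is where the mismatches show up.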
Media or Publishing Site Content Correlation Analysis
Description: One of the staples of our analytic techniques on content-heavy sites is to model the viewing relationships between content categories. We ask questions of the sort "What type of Content are Visitors who View Type X most likely to be Interested In?" and "Does the order of viewing Content (Starting with X or Starting with Y) alter their likelihood of being interested in other types of Content?" These questions of relation and order are similar, in some respects, to attribution analysis. However, answering them produces a different set of actionable tactics focused on suggesting additional content for consumption (or helping editors choose the best suggestions). These types of answers can also help model the potential value of new visitors acquired by content type.
Method: We use Factor Analysis and Tetrachoric Correlation to identify content associations. Separately, we'll analyze the problem set with order of viewing treated as a content attribute.
Deliverable: A full content type correlation matrix with significant relationships called out. For the order analysis, we'll identify significant ordering relationships (cases where the order viewed changes the content associations).
Requirements: A data feed from your web analytics solution that includes some means of identifying content pages and their category type. This analysis usually requires a short-moderate time slice of data. We'd recommend a week to a month of data.
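As an illustration of the correlation step, here's a small sketch that builds a content-association matrix from per-visit category data. It uses the phi coefficient as a simple stand-in for the tetrachoric correlation named above (tetrachoric estimation assumes an underlying bivariate normal and is usually done with a stats package); the visit format and variable names are hypothetical.

```python
from itertools import combinations
import math

def phi(visits, a, b):
    """Phi coefficient between two binary 'viewed this category' indicators.
    (A simple stand-in for the tetrachoric correlation used in practice.)"""
    n11 = sum(1 for v in visits if a in v and b in v)          # viewed both
    n10 = sum(1 for v in visits if a in v and b not in v)      # viewed a only
    n01 = sum(1 for v in visits if a not in v and b in v)      # viewed b only
    n00 = len(visits) - n11 - n10 - n01                        # viewed neither
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom

# Hypothetical visits: the set of content categories each visit touched.
visits = [
    {"news", "sports"},
    {"news", "sports"},
    {"finance"},
    {"finance"},
    {"news"},
]
categories = sorted({c for v in visits for c in v})
matrix = {pair: phi(visits, *pair) for pair in combinations(categories, 2)}
```

The resulting matrix makes the associations legible at a glance: in the toy data, news and sports co-occur (positive phi) while finance and news repel (negative phi). The order-of-viewing variant would simply treat "X viewed first" as an additional attribute in the same matrix.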
Customer Support Content Effectiveness Analysis
Description: Most Customer Support Sites have lots of long-tail support content. It's hard to analyze and too expensive to tackle page by page - unlike conversion funnel pages, the effort it takes to find out which pages are working well and which aren't is simply more than the insight is worth. On the other hand, efficient optimization of the entire content base would yield significant benefits (a la long-tail SEO). What's needed is an efficient, comprehensive method of identifying the best support content for each type of problem (based on internal search keyword or on the stubbed navigation path - the two pages prior to the content) that involves little or no manual effort.
Method: We'll model exit behavior for each type of problem. We'll identify the most common true end-point (the content page from which the user no longer needs to continue their journey). If available, we'll refine the analysis with data from your page-based user-feedback mechanisms. The analysis will be entirely mechanical and will cover every common search term, stubbed navigation path, and content support page on the site.
Deliverable: A list of the entire problem space and, for each problem type (such as individual internal search term), the best content to drive to. For significant problem-spaces (higher volume of search), the best content may be an ordered list of the best content links.
Requirements: A data feed from your web analytics solution that includes internal search terms and some means of identifying support and support content pages. Because this is a long tail analysis, we recommend using a fairly significant time-slice of data - 3 to 6 months is optimal though the analysis can be done on much shorter time-slices for heavily trafficked support sites.
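A stripped-down sketch of the end-point logic: for each problem type, count which support page visitors most often viewed last. The feed format and the `best_content` function are hypothetical simplifications; the real analysis would also distinguish successful resolution from abandonment and fold in page-based feedback data where available.

```python
from collections import Counter, defaultdict

def best_content(support_sessions):
    """For each problem type, return the support page visitors most often
    viewed last - the presumed point at which the journey ended."""
    endpoints = defaultdict(Counter)
    for problem, pages in support_sessions:
        if pages:
            endpoints[problem][pages[-1]] += 1  # last page viewed = candidate end-point
    return {problem: counts.most_common(1)[0][0]
            for problem, counts in endpoints.items()}

# Hypothetical feed: (problem type, support pages viewed in order).
support_sessions = [
    ("reset password", ["faq", "reset-guide"]),
    ("reset password", ["reset-guide"]),
    ("reset password", ["faq"]),
    ("billing error", ["billing-faq"]),
]
best = best_content(support_sessions)
```

Because the logic is entirely mechanical, it scales across the whole long tail of search terms and stubbed paths; keeping the full `Counter` per problem (rather than just the top entry) yields the ordered list of best content links mentioned in the deliverable.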
There you have it, three highly structured analytic techniques for "getting the data to tell its secrets." One (site navigational analysis) is appropriate for any Website; a second (Content Correlation Analysis) is appropriate for any content-rich site; the third (Customer Support Content Effectiveness) is targeted to support sites with long-tail content. In each case, we at Semphonic have done the analysis at least once, but we've never standardized the process. We're probably closest with Content Correlation Analysis, since we've done that many times.
I'd very much like to see if these techniques can be done super-efficiently in a repeatable and valuable manner. Ultimately, I think the most important fruits of analytics will be this type of highly structured and consistently valuable technique. In their absence, it will always be an open question how valuable analytics actually is, and the degree to which companies are willing to commit to the creation and use of an analytics practice will always be limited.
If you're interested in giving one or more of these a try with your data, drop me a line and we can talk!
As always, Gary, a thought-provoking post! While these are techniques that are potentially *repeatable*, to what extent would it make sense for a company to actually repeat them on a recurring basis? As an agency, certainly, it seems like repeatability is a desired goal. As a single company/brand, though, how often would it make sense to repeat these techniques? Are all three applicable to run at the cadence of site updates (do the analysis, update the site/content, wait for collection of new data, repeat...)?
I've shied away from the phrase "recurring analysis" in the past, as that tends to get into mucking up the distinction between performance measurement and analysis. And, it leads to a situation where I'm pointing out that, "If nothing was changed, results are likely to be the same."
Good luck with the experiment!
Posted by: Tim Wilson | September 26, 2011 at 03:04 AM
Tim,
It's a great point - there's no doubt that recurrence is more important to me than it is to enterprise practitioners, and that's true for each of the techniques I've laid out (as it would also be for Market Basket analysis in retail analytics). Like Market Basket analysis, most of the techniques I've laid out should be repeated, but they aren't necessarily going to be something you do on a constant basis. They are probably, as you suggest, the types of analysis you would need to match to the cadence of site change - repeatable in the context of significant site changes, new content, or simply enough elapsed time to make a potential difference. In truth, there are very few analytic techniques that are consistently employable in any other fashion - though direct response modeling is probably an example of one that is.
As you suggest, if nothing much has changed in the business, how can any analysis be expected to add incremental insight every time it is run? Even something like attribution analysis, while likely to be constant in execution, will probably only yield sporadic conclusions of interest unless a company is unusually dynamic in their campaign strategies.
On the other hand, I think that having a quiver full of techniques that are nearly guaranteed to yield value is almost as critical for enterprise-side practitioners as it is for consultants. Those techniques tend to carry a lot of water in the organization - making it far easier to justify the exploratory analysis projects and the team necessary to conduct them. No analysis team is going to be successful all the time (or probably even a majority of the time) - so having some high-probability winners is very important.
Posted by: Gary Angel | September 26, 2011 at 09:35 AM