Faceted search is at the heart of the ecommerce site experience. But despite its central role, the complexities of measuring and understanding faceted search behavior have contributed to making it both under-studied and under-optimized. In my last post, I described a basic data model for faceting that was designed to capture non-lossy, detailed facet behavior for analysis. While that data model is a significant improvement over the basic storage of facet data as it’s typically collected, I would expect that any competent analyst could and would design similar structures. In today’s post, however, I’m going to describe aggregation strategies for faceted search across three different dimensions (category, product and customer) – all of which are typically missed or ignored.
Aggregating facet data can help analysts understand how faceting works in particular situations and it’s useful, too, as a guide to how to think about faceting performance. Keep in mind that facet performance is always relative to particular categories. There’s no reason to believe that faceting works the same way (or as well) regardless of whether it’s applied to big screen TVs or children’s play structures. The category-specific nature of faceting is a function of several different factors. The facets are often different for each category, but even when the facets are the same, the usage of them and their contents will differ. There may be more or fewer brands in a given category, those brands may be better or less well known, price ranges will be different, and so on.
So if you want to track and optimize faceting performance, you have to track faceting relative to individual categories. This first aggregation is designed to capture, at a high-level, how faceting is used within a category. The goal here is to understand how often faceted search is used for a category and how this compares to overall category consumption:
- Total Category Views, Carts, Purchases
- Category View, Cart and Purchase Performance overall
- Category View, Cart and Purchase Performance when Faceting is Used
- Category View, Cart and Purchase Performance when no Faceting is Used
- Number of Faceted Searches
- Avg. Facets per Session
- Avg. Facets per Visit with a Product Detail
- Avg. Facets per Visit with a Product Cart
- Avg. Facets per Visit with a Purchase
- Avg. Products Returned
This data is simple, and while it isn’t particularly actionable, it can help answer questions around which categories get more or less usage of faceting and whether, at a high level, faceting performs better or worse than you might expect in a given category.
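A few of the category-level metrics listed above can be sketched in plain Python. This is only an illustration: the `visits` records and their field names (`facets`, `viewed`, `carted`, `purchased`) are assumptions made for the example, not the data model from the previous post.

```python
from collections import defaultdict

# Hypothetical per-visit records for category browsing; all field names are assumed.
visits = [
    {"category": "TVs", "facets": 2, "viewed": True,  "carted": True,  "purchased": False},
    {"category": "TVs", "facets": 0, "viewed": True,  "carted": False, "purchased": False},
    {"category": "TVs", "facets": 3, "viewed": True,  "carted": True,  "purchased": True},
    {"category": "TVs", "facets": 0, "viewed": False, "carted": False, "purchased": False},
]

def rate(rows, field):
    """Share of visits in `rows` where the boolean `field` is true."""
    return sum(r[field] for r in rows) / len(rows) if rows else 0.0

def category_summary(visits):
    """Compare funnel performance for visits with and without faceting, per category."""
    by_category = defaultdict(list)
    for v in visits:
        by_category[v["category"]].append(v)

    summary = {}
    for category, rows in by_category.items():
        faceted = [r for r in rows if r["facets"] > 0]
        unfaceted = [r for r in rows if r["facets"] == 0]
        summary[category] = {
            "visits": len(rows),
            "faceted_visits": len(faceted),
            "avg_facets_per_visit": sum(r["facets"] for r in rows) / len(rows),
            "cart_rate_faceted": rate(faceted, "carted"),
            "cart_rate_unfaceted": rate(unfaceted, "carted"),
            "purchase_rate_faceted": rate(faceted, "purchased"),
            "purchase_rate_unfaceted": rate(unfaceted, "purchased"),
        }
    return summary

summary = category_summary(visits)
```

Putting the faceted and unfaceted rates side by side for each category is what makes the "better or worse than you might expect" comparison possible.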
The next set of data is an aggregation designed to capture the order in which facets are applied and the impact of faceting on the size of the product list. This is an area where aggregations are very lossy, so be warned that to really analyze these questions the detail data is eventually going to be necessary.
- Category (key)
- Facet Position (key)
- Facet Type (key)
- Facet Value (key)
- Count
- Avg. Number of Items Returned
- Avg. Difference in Items from Previous
- # of Views
- # of Carts
- # of Purchases
- % of Time Top Item Viewed
- % of Time Facet Removed
What I’ve done here is build a cube where Category, Facet Position (1-n facet applications), Facet Type (brand, price, etc.), and Facet Value (Sony, Samsung, etc.) are the keys. For each unique combination, you get a set of metrics that help a merchandiser evaluate how well this facet performed in this position to both narrow the search and drive purchase funnel behaviors. This cube can help a merchandiser figure out which facets work best in which positions – something that can be influenced in the UI.
The goal here is to support quick analysis and reporting (in tools like Tableau) of how effective each facet type is when used in a category. That’s a set of questions that merchandisers have, historically, had a hard time answering, but an aggregation like this can make visual exploration of this data both possible and convenient. Depending on the number of facet types and values, some consolidation of little-used options may be necessary to control cube volumes. But most categories for most retailers will produce very manageable cubes that can support rapid visualization.
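As a rough sketch of how the category cube might be built from detail data, the example below groups facet applications by the four keys and computes a subset of the metrics. The `applications` records and every field name in them are hypothetical, chosen only to make the example self-contained.

```python
from collections import defaultdict

# Hypothetical detail rows, one per facet application; field names are assumed.
applications = [
    {"category": "TVs", "position": 1, "type": "brand", "value": "Sony",
     "items_before": 200, "items_after": 40,
     "viewed": True, "carted": True, "purchased": False, "removed": False},
    {"category": "TVs", "position": 1, "type": "brand", "value": "Sony",
     "items_before": 200, "items_after": 60,
     "viewed": True, "carted": False, "purchased": False, "removed": True},
    {"category": "TVs", "position": 2, "type": "price", "value": "$500-$999",
     "items_before": 40, "items_after": 10,
     "viewed": True, "carted": True, "purchased": True, "removed": False},
]

def build_category_cube(applications):
    """Aggregate facet applications by (category, position, type, value)."""
    groups = defaultdict(list)
    for a in applications:
        key = (a["category"], a["position"], a["type"], a["value"])
        groups[key].append(a)

    cube = {}
    for key, rows in groups.items():
        n = len(rows)
        cube[key] = {
            "count": n,
            "avg_items_returned": sum(r["items_after"] for r in rows) / n,
            "avg_item_difference": sum(r["items_before"] - r["items_after"] for r in rows) / n,
            "views": sum(r["viewed"] for r in rows),
            "carts": sum(r["carted"] for r in rows),
            "purchases": sum(r["purchased"] for r in rows),
            "pct_facet_removed": sum(r["removed"] for r in rows) / n,
        }
    return cube

category_cube = build_category_cube(applications)
```

Consolidating little-used facet values into an "other" bucket before grouping would be one way to keep the number of key combinations under control.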
Understanding faceting at the category level feels very natural, but it’s not the only type of aggregation or analysis I’d suggest. I think it’s probably more interesting to understand how faceting works from a product perspective – and this is a view that is particularly interesting as a way for a retailer to help manufacturers understand product choice behavior.
From a product perspective, what I want to understand is when and why a product gets eliminated from the consideration set. When you start in a category, every single product is in the category set and has an order based on the sorting criteria currently set by the user. Each time a user applies a facet, the product position in the search set can change and, of course, the product may drop out altogether. For illustrative purposes, I’m just going to focus on the dropping question. An aggregation very similar to the one above but set only for cases where the product is eliminated from the consideration set can make this type of exploration possible:
- Product (key)
- Facet Position when eliminated (key)
- Facet Type when eliminated (key)
- Facet Value when eliminated (key)
- Count
- Avg. Number of Items Returned
- Avg. Difference in Items from Previous
- Avg. Position of Product in Previous
- # of Views
- # of Carts
- # of Purchases
- % of Time Top Item Viewed
- % of Time Facet Removed
This tells me which facet types and values visitors used that eliminated a product from consideration – and when (by position) they were used. A variant of this would be to add average position of product in the list as a metric and create this cube for every combination as we did above for category. That’s great for tracking which facets tended to produce the biggest changes in search position for a product but not so good for answering the question around what facet eliminated a product from consideration.
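One way to derive the elimination keys is to walk each session's facet steps and compare the result list before and after each one. The sketch below assumes a hypothetical `session` structure holding the ordered product list after every facet application. It ignores facet removals, which can bring products back into the set, and computes only the count and the average previous position.

```python
from collections import defaultdict

# Hypothetical session: the initial ordered product list, then the ordered
# results after each facet application. Structure and names are assumed.
session = {
    "initial": ["A", "B", "C", "D"],
    "steps": [
        {"type": "brand", "value": "Sony", "results": ["A", "C", "D"]},
        {"type": "price", "value": "under $500", "results": ["C"]},
    ],
}

def eliminations(session):
    """Yield (key, previous_rank) for each product a facet step drops."""
    previous = session["initial"]
    for position, step in enumerate(session["steps"], start=1):
        remaining = set(step["results"])
        for rank, product in enumerate(previous, start=1):
            if product not in remaining:
                yield (product, position, step["type"], step["value"]), rank
        previous = step["results"]

def build_elimination_cube(sessions):
    """Aggregate eliminations by (product, position, facet type, facet value)."""
    ranks = defaultdict(list)
    for s in sessions:
        for key, rank in eliminations(s):
            ranks[key].append(rank)
    return {
        key: {"count": len(r), "avg_previous_position": sum(r) / len(r)}
        for key, r in ranks.items()
    }

elimination_cube = build_elimination_cube([session])
```

Here product B is dropped by the first facet (brand), while A and D survive until the price facet in position two.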
The last type of facet aggregation I want to suggest is for customers. The way someone facets tells you a huge amount about what they are interested in and how they shop. You can and should find ways to cull out some of this data and attach it to the customer record. Of course, when you’re talking about attaching faceting data to the customer record, you’re committed to aggregating that data into simple, singular fields that can be profitably joined to a customer id. That’s challenging! It may be that you want to consider two different alternatives: building a model of customer faceting behavior off the detail data and using model codes at the customer level versus finding truly interesting single field aggregations. While the former is probably more complex in practice, it’s fairly easy to understand. Instead, I want to suggest some simple field type aggregations that I think are interesting.
Let’s start with whether or not a customer facets on brand, price or feature, how often they facet on each, and which tends to drive their final consideration set. Since I know that faceting is category specific, I’d be inclined to make these fields indexical. What I mean by that is that I’d index the user’s propensity to brand facet against the overall propensity of people to brand facet in the categories they’ve actually searched, and do the same for price and feature. Having these three propensities seems to me to be a powerful way to distinguish brand-focused from price-sensitive shoppers.
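To make the indexing idea concrete, here is a minimal sketch under some assumptions: facet applications arrive as hypothetical `(customer_id, category, facet_type)` tuples, and the baseline for each category is the share of all facet applications in that category (including the customer's own) that use the facet type. An index above 1.0 means the customer uses that facet type more than is typical for the categories they searched.

```python
from collections import Counter

# Hypothetical facet applications: (customer_id, category, facet_type).
events = [
    ("c1", "TVs", "brand"), ("c1", "TVs", "brand"),
    ("c1", "TVs", "price"), ("c1", "TVs", "brand"),
    ("c2", "TVs", "price"), ("c2", "TVs", "price"),
    ("c2", "TVs", "brand"), ("c2", "TVs", "feature"),
    ("c2", "Stereos", "price"),
]

def propensity_index(events, customer, facet_type):
    """Customer's share of facet_type divided by the share expected from
    the category baselines, weighted by the customer's own category mix."""
    category_totals = Counter(cat for _, cat, _ in events)
    category_type = Counter(cat for _, cat, t in events if t == facet_type)

    mine = [(cat, t) for c, cat, t in events if c == customer]
    if not mine:
        return None  # customer never faceted

    actual = sum(t == facet_type for _, t in mine) / len(mine)
    # Average the category baseline over the customer's own events, so
    # categories they facet in more often carry more weight.
    expected = sum(category_type[cat] / category_totals[cat] for cat, _ in mine) / len(mine)
    return actual / expected if expected else None

brand_index_c1 = propensity_index(events, "c1", "brand")
brand_index_c2 = propensity_index(events, "c2", "brand")
```

With this toy data, c1 comes out as brand-leaning (index 1.5) and c2 as the opposite (index 0.5), even though both shopped TVs.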
Depending on your business and the cardinality of your product set, it might also make sense to flat out track brand propensities at the customer level. If you’re trying to extract data for display re-marketing (for example), this is a great way to do it.
There is also considerable learning to be had in the actual price faceting a customer uses. One challenge to price faceting is that every category will have a different range of prices. You can manage that by dividing up price facets into quartiles (for example) for the category. Taking this approach, you can classify any facet application as selecting one or more quartiles of price within the category. Counting how often a customer price facets by quartile and how often their product decision falls in each of these quartiles is a powerful indicator of how they shop. I’m not saying that every customer is consistent in price strategies across categories, but I do think that shoppers are more consistent than not. A value shopper in TVs is quite likely to be a value shopper in stereos. And someone who buys a high-end but not top-tier hotel room is likely enough to make the same kind of decision when renting a car.
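The quartile classification could look something like the sketch below. It assumes you have the list of product prices for a category and that each price facet can be expressed as a hypothetical `(low, high)` range; a facet is said to select a quartile when at least one product in that quartile falls inside the range. Quartile boundaries come from the standard library's `statistics.quantiles`.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles

# Hypothetical product prices for one category.
category_prices = [200, 300, 400, 500, 600, 800, 1000, 1500]

# Three cut points splitting the category's prices into four quartiles.
cuts = quantiles(category_prices, n=4, method="inclusive")

def quartile_of(price):
    """Return 1-4: which price quartile of the category this price falls in."""
    return bisect_right(cuts, price) + 1

def quartiles_selected(low, high):
    """Quartiles containing at least one product priced within [low, high]."""
    return {quartile_of(p) for p in category_prices if low <= p <= high}

# Hypothetical price facets one customer applied in this category.
customer_price_facets = [(0, 499), (0, 299), (1000, 2000)]

quartile_counts = Counter(
    q
    for low, high in customer_price_facets
    for q in quartiles_selected(low, high)
)
```

The resulting counter is the kind of simple field that can be attached to a customer record; the same `quartile_of` function can classify the price of the product they ultimately chose.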
Beyond the range of price or the actual values of brand, this is an area where the order in which a customer applies facets may also be significant. Do they think “brand”, “price” or “features” first? You can capture a counter of this and have a really interesting take on how a customer approaches shopping.
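The counter itself is easy to sketch. The example assumes hypothetical `(customer_id, ordered facet types)` pairs, one per search session, and tallies which facet type each customer reached for first.

```python
from collections import Counter, defaultdict

# Hypothetical sessions: customer id plus facet types in the order applied.
sessions = [
    ("c1", ["brand", "price"]),
    ("c1", ["brand", "feature", "price"]),
    ("c1", ["price"]),
    ("c2", ["price", "brand"]),
    ("c2", []),  # a session with no faceting is skipped
]

def first_facet_counts(sessions):
    """Count, per customer, how often each facet type was applied first."""
    counts = defaultdict(Counter)
    for customer, facet_types in sessions:
        if facet_types:
            counts[customer][facet_types[0]] += 1
    return counts

first_counts = first_facet_counts(sessions)
# most_common(1) gives the customer's dominant first facet type.
c1_leading_facet = first_counts["c1"].most_common(1)[0][0]
```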
In truth, I’ve only scratched the surface with the ways you might build out faceting and create data sets for modeling and analysis. Faceting is so rich and so interesting that it will reward many different views and many different types of analysis. All the more shame then, that it is one of the most neglected aspects of the site experience when it comes to optimization.