Attribution – assigning the proper credit for a conversion to a campaign – is a critical part of digital analytics. After all, digital is inherently multi-touch. And in a multi-touch environment, you can’t optimize properly unless you can assign credit accurately. That’s why nearly every digital analytics professional and every digital marketer knows that simplistic attribution methods like “last touch” don’t work. We’ve been doing quite a bit of thinking about and working on attribution in our practice recently, and before I continue my series on Personalization I wanted to capture the gist of a hot internal discussion that bears directly on one of the key analytics projects in digital.
Here’s a short recap of attribution methods and the current state-of-the-art.
The most common and basic method of attribution is to assume that whatever campaign a visitor last interacted with before converting is responsible for the conversion. Assigning the full credit for a conversion to this most recent campaign source is called “Last Touch” attribution. Some companies prefer “First Touch” attribution, which gives full credit to the campaign that a visitor first sourced from. In other words, if a visitor sees a display ad, clicks through onto the Website, leaves, and then returns by clicking through on a PPC ad, full credit would go to the display ad since it “started” the conversion. These rules may be slightly complicated by the introduction of time windows. Some companies, for example, give credit to the first source within the last 30 days. Of course, if a visitor has only one campaign source, then there isn’t any attribution issue. That source gets the credit since it is first, last, middle, etc.
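As a rough illustration of the single-touch rules just described, here is a minimal Python sketch. The function name single_touch_credit, the (campaign, timestamp) tuple layout, and the window_days parameter are all assumptions made for this example; they don't come from any particular analytics tool.

```python
from datetime import datetime, timedelta

def single_touch_credit(touches, conversion_time, rule="last", window_days=None):
    """Return the one campaign that gets full credit, or None if nothing is eligible.

    touches: list of (campaign_name, timestamp) tuples (hypothetical layout).
    rule: "first" or "last".
    window_days: optional lookback window measured back from the conversion.
    """
    if rule not in ("first", "last"):
        raise ValueError("rule must be 'first' or 'last'")
    # Only touches that happened at or before the conversion can earn credit.
    eligible = [t for t in touches if t[1] <= conversion_time]
    if window_days is not None:
        cutoff = conversion_time - timedelta(days=window_days)
        eligible = [t for t in eligible if t[1] >= cutoff]
    if not eligible:
        return None
    eligible.sort(key=lambda t: t[1])
    return eligible[0][0] if rule == "first" else eligible[-1][0]

# Example journey: a display click, then a PPC click weeks later, then a conversion.
journey = [
    ("display", datetime(2024, 1, 1)),
    ("ppc", datetime(2024, 2, 20)),
]
converted_at = datetime(2024, 2, 21)
print(single_touch_credit(journey, converted_at, rule="first"))
print(single_touch_credit(journey, converted_at, rule="last"))
print(single_touch_credit(journey, converted_at, rule="first", window_days=30))
```

Note how the 30-day window changes the "first" answer: the display touch falls outside the window, so the PPC touch becomes both first and last.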
The flaws with both First Touch and Last Touch attribution are pretty obvious. In a multi-touch environment, we tend to think that each touch probably played some role in conversion. So if we arbitrarily give all the credit to one campaign, we’ve over-credited it and under-credited the loser. This might not matter much if the order of campaign touches was random. Given enough conversions, we’d probably expect that each campaign came first or last about equally - thereby balancing out the credit. Unfortunately, we know that’s not the case. Some types of marketing are much earlier in the customer journey than others. It’s pretty much established wisdom that, for example, branded search tends to occur very late in the funnel. This creates significant bias as some campaigns are much more likely to be the “first” or “last” touch than others.
There’s an (apparently) simple way to solve the problem: just give some credit to each touch. People have hit on four methods of doing that. The first method is to give full credit to every touch. This makes your marketing look super-effective. However, it will almost certainly create push-back in the organization. Nobody in management wants to hear that marketing drove 10 million in sales when you only had 1 million in revenue! Just about as simple is the second method: give equal credit to every touch but divide the credit so that it sums to a single conversion or conversion value. This keeps total credit equal to total conversions while avoiding any ordering bias. Of course, not everyone wants to avoid ordering bias, which brings me to the third method: order-weighted attribution. With order-weighted attribution, you devise a curve that divvies up credit for a conversion based on a weighting system driven by the order of touches. It might look like this:
1 Touch: 100%
2 Touches: 60% 1st : 40% 2nd
3 Touches: 50% 1st : 30% 2nd : 20% 3rd
And so on.
Weighted attribution allows you to optimize for a preference based on original sourcing or final conversion. You can even use a U shaped method like this:
1 Touch: 100%
2 Touches: 50% 1st : 50% 2nd
3 Touches: 40% 1st : 20% 2nd : 40% 3rd
This method overweights the first and last touch. The drawback to weighting is that it’s arbitrary. There’s really no reason to believe that any single weighting system somehow captures accurately the right credit for any given sequence of campaigns and there’s every reason to think that the credit should vary depending on the order, time and nature of the individual campaigns.
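To make the rule-based methods concrete, here is a small Python sketch of full credit, equal credit, and position weighting using the two example curves above. The function names and the dictionary layout are hypothetical, and the weight tables only cover journeys of up to three touches because that is all the examples specify.

```python
# Weight tables copied from the examples in the text, keyed by number of touches.
ORDER_WEIGHTS = {1: [1.0], 2: [0.6, 0.4], 3: [0.5, 0.3, 0.2]}
U_SHAPED_WEIGHTS = {1: [1.0], 2: [0.5, 0.5], 3: [0.4, 0.2, 0.4]}

def _accumulate(touches, amounts):
    """Sum credit per campaign (a campaign may appear more than once)."""
    credit = {}
    for campaign, amount in zip(touches, amounts):
        credit[campaign] = credit.get(campaign, 0.0) + amount
    return credit

def full_credit(touches, value):
    """Every touch gets the whole conversion value (credit over-counts)."""
    return _accumulate(touches, [value] * len(touches))

def equal_credit(touches, value):
    """Every touch gets the same share; shares sum to the conversion value."""
    return _accumulate(touches, [value / len(touches)] * len(touches))

def weighted_credit(touches, value, weights):
    """Split the value by position using one of the weight tables above."""
    if len(touches) not in weights:
        raise ValueError("no weights defined for this many touches")
    return _accumulate(touches, [value * w for w in weights[len(touches)]])

journey = ["display", "email", "ppc"]  # ordered first touch to last touch
print(full_credit(journey, 100.0))
print(equal_credit(journey, 100.0))
print(weighted_credit(journey, 100.0, ORDER_WEIGHTS))
print(weighted_credit(journey, 100.0, U_SHAPED_WEIGHTS))
```

The only thing that differs between the last two calls is the weight table, which is exactly the arbitrariness the text complains about: nothing in the data chooses between them.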
Which brings us to the 4th method: algorithmic attribution. In algorithmic attribution, you’re creating a statistical model of the impact of campaign presence on conversion. You have a set of records with one record for every visitor with at least one campaign touch. The campaign touches are either time or order stamped (or both). You then build a model that weights the probability of conversion when a campaign is present. Take time and order out of the problem for a minute and think about the resulting analysis. In the most straightforward type of model, you’re measuring the likelihood of a conversion given the presence of a campaign. To assign credit, you then adjust the weightings so that they add up to 100% for any given stack.
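Here is a deliberately simple Python sketch of that "most straightforward" model: estimate the conversion rate among visitors for whom each campaign is present, then rescale those rates so they sum to 100% for a given stack of campaigns. Real algorithmic attribution products typically fit a regression or similar model; the plain conversion-rate estimate, the function names, and the record layout here are assumptions chosen to keep the example short.

```python
from collections import defaultdict

def presence_conversion_rates(visitors):
    """Conversion rate among visitors who had each campaign present.

    visitors: list of (campaigns, converted) pairs, where campaigns is an
    iterable of campaign names and converted is a bool (hypothetical layout).
    """
    seen = defaultdict(int)
    converted_count = defaultdict(int)
    for campaigns, converted in visitors:
        for campaign in set(campaigns):  # presence only; ignore order and frequency
            seen[campaign] += 1
            if converted:
                converted_count[campaign] += 1
    return {c: converted_count[c] / seen[c] for c in seen}

def credit_for_stack(stack, rates):
    """Rescale the per-campaign rates so credit sums to 100% for this stack."""
    campaigns = list(dict.fromkeys(stack))  # de-duplicate, keep order
    total = sum(rates.get(c, 0.0) for c in campaigns)
    if total == 0:
        # No signal at all: fall back to an equal split.
        return {c: 1.0 / len(campaigns) for c in campaigns}
    return {c: rates.get(c, 0.0) / total for c in campaigns}

visitors = [
    ({"display"}, False),
    ({"display"}, False),
    ({"display", "ppc"}, True),
    ({"ppc"}, True),
]
rates = presence_conversion_rates(visitors)
print(rates)
print(credit_for_stack(["display", "ppc"], rates))
```

Notice that the only inputs are campaign touches and conversions. Nothing in the records says what the visitor would have done with no marketing at all.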
This seems like a vast improvement on the essentially subjective weightings used in any of the other methods. It was, until pretty recently, the method we would have recommended for developing a robust attribution analysis.
Of course, there are more complex models possible, but it doesn’t really matter. Because no matter which method you choose, the analysis is fundamentally flawed. And it’s flawed, like most analysis problems, not because the model lacks sophistication but because the choice of variables is inappropriate.
I’m going to lay out a couple scenarios that I think illustrate exactly why this type of algorithmic attribution is wrong-headed and often worse than useless.
Scenario #1: A company has an annualized subscription product. Typically 85% of customers renew. The company starts a new email campaign targeted to existing subscribers – these haven’t previously been marketed to. The email is sent the week before the renewal date. The vast majority of customers who get this new email campaign open the email. 85% of these customers renew. The visitors who viewed this campaign had only a single campaign touch. Regardless of whether you use first, last, participation or algorithmic analysis, 100% of the credit for all those renewals will go to the new campaign.
Scenario #2: You’re a large dealer network in Northern California selling cars. In a customer exit survey, you find that 10% of your customers frequently use a car enthusiast site targeted toward the same region you sell in. You launch a run-of-site display campaign on the targeted property. The results are excellent. Many previously unidentified visitors to your Website had a display impression in the new campaign and the conversion rate is quite high. About half of these visitors convert without any additional touches. The other half have additional touches, especially branded search on PPC. Even after a year of display advertising on the site, your customer exit surveys still show that about 10% of your customers frequently use that site. Depending on the attribution method you use, you’ll credit somewhere between 50% and 100% of these conversions to the new display campaign.
Scenario #3: You’re an airline and you buy branded pay-per-click keywords. You already hold the top 3 organic results, and you’re able to track organic clicks. With the addition of the paid ads, you see some decline in organic clicks but this is more than offset by the increase in total clicks. The majority of these pay-per-click keyword visits convert directly. About half had some previous campaign touch. Depending on the attribution method you use, you’ll credit somewhere between a little less than 50% and 100% of these conversions to the branded pay-per-click campaign.
These are all actually fairly common cases; I’ve adapted the names and places, but each scenario represents a case I’ve seen many times. These aren’t edge cases. Indeed, I think they capture something that is fundamentally wrong with algorithmic attribution. Because in each case, it’s quite likely that the campaigns in question had ABSOLUTELY NO positive lift.
In Scenario #1, the absence of lift is so obvious as to not need explaining. But no statistical method that deals only with marketing touches and conversion can possibly arrive at the correct answer of zero lift. It doesn’t matter how advanced your method is, it will fail. Now a statistician will likely shrug his/her shoulders and reply – correlation isn’t causation. True, but if an attribution model is supposed to be actionable, correlation darn well better be causation. And if it isn’t, how are you supposed to make decisions on the data?
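The arithmetic behind Scenario #1 fits in a few lines of Python. The 85% renewal rates come from the scenario; the count of 1,000 emailed subscribers and the function name are hypothetical, picked only to make the numbers easy to read.

```python
def credited_vs_incremental(emailed, renewal_rate_with_email, baseline_renewal_rate):
    """Compare what single-touch attribution credits with what the email added.

    emailed: number of subscribers who received the email (hypothetical count).
    """
    # Attribution sees one touch per renewing subscriber, so it credits them all.
    credited = emailed * renewal_rate_with_email
    # Lift is only the renewals beyond what would have happened anyway.
    incremental = emailed * (renewal_rate_with_email - baseline_renewal_rate)
    return credited, incremental

credited, incremental = credited_vs_incremental(1000, 0.85, 0.85)
print(f"credited to the email: {credited:.0f} renewals")
print(f"incremental renewals:  {incremental:.0f}")
```

The baseline renewal rate is the one number that separates the two results, and it is exactly the number that touch-and-conversion data does not contain.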
Scenario #2 is trickier. Unlike #1, this scenario will and routinely does fool people. When you find a site that’s highly complementary to yours, then placing a display ad on that site is likely to yield pretty good results. But remember, the site is complementary! You were already going to get a significant number of those folks on your Website and they were going to have a good conversion rate. That’s why you targeted the ad to the site in the first place. In Scenario #2, the % of final customers who frequent the enthusiast site never budged (it was always 10%) – but the number of customers who were attributed to the display campaign skyrocketed as soon as it was placed on the complementary Website. It should be clear that this MUST happen anytime you do a run-of-site display campaign targeted to a Website with significant overlap with your prospect base.
Scenario #3 is another tricky one. Many folks don’t bother to track their organic cannibalization, so they make an even more basic mistake than is illustrated here. But I’ve factored that in. In this scenario, the Branded PPC buying is net positive even when a loss in organic clicks is factored in. But suppose that about 20% of the people who type in a branded organic search then type in your URL in the address bar. Perhaps they just aren’t sure how to spell your brand or whether your Website is your full name or some common abbreviation. You’ve never accounted for direct traffic since it isn’t campaign sourced. But your branded PPC buy captures half of that direct URL traffic, making it look very successful relative to its cost.
In each of these scenarios, your marketing is creating zero lift. But in each case, ANY attribution method that is focused on assigning credit to marketing touches based on their presence, frequency, and order will be completely incorrect.
In each case, what’s happening is that your marketing is targeting some group with a pre-existing propensity to convert and that was not previously attached to a marketing campaign. For most large brands, I think this problem is ubiquitous in a way that makes such approaches far worse than useless. Making allocation decisions based on this type of attribution is deeply misleading. In fact, you’ll almost certainly end up making your marketing spend allocations consistently worse even as all your attribution metrics appear to improve.
Great job spending all that money on a fancy attribution system.
Now take a look at this article on Vox about eBay’s Google Ad Spend.
The author is focused on the broad question of whether or not buying pay-per-click ads on Google is effective. But, to be honest, that doesn’t interest me as much since it's so specific to an individual company. I think the article is best read as a study in the failure of attribution. Think about the findings of the study and how each and every one would be challenging to unearth with algorithmic attribution. eBay (!) resorted to massive A/B tests to decide whether campaigns delivered incremental lift and the answers directly controverted nearly everything their marketing analytics had previously told them.
So is massive A/B testing a better answer than algorithmic attribution? Yes, I think it is; it’s also way more work and it’s work that never really ends.
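For readers who want to see what a holdout comparison involves, here is a minimal Python sketch that measures lift as the difference in conversion rate between a group that saw a campaign and a randomly held-out control group, with a standard two-proportion z-test. This is a generic textbook calculation, not a description of eBay's methodology; the function name and the sample counts are made up for illustration.

```python
import math

def holdout_lift(treated_conv, treated_n, control_conv, control_n):
    """Return (absolute lift, z statistic, two-sided p-value).

    Assumes visitors were randomly assigned to treated and control groups.
    """
    p_treated = treated_conv / treated_n
    p_control = control_conv / control_n
    # Pooled rate under the null hypothesis that the campaign has no effect.
    pooled = (treated_conv + control_conv) / (treated_n + control_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / treated_n + 1 / control_n))
    z = (p_treated - p_control) / se if se > 0 else 0.0
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_treated - p_control, z, p_value

# Scenario #1 style result: same renewal rate with and without the email.
print(holdout_lift(850, 1000, 850, 1000))
# A campaign with an apparent two-point lift on hypothetical counts.
print(holdout_lift(120, 1000, 100, 1000))
```

The test itself is cheap. The ongoing cost is in withholding marketing from a random control group for every campaign you want to evaluate, which is why the work never really ends.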
Keep in mind that it’s not impossible to build a better model. It just means that the model has to be built differently. Instead of focusing on fancier and fancier weightings of campaign presence, the right approach is to create a two-stage model. The first stage should be focused on the pre-existing likelihood of a given type of customer converting. Once you have that model, you can now measure which campaigns deliver incremental lift when applied to those populations. This method makes it far less likely that you’ll confuse correlation with causation when it comes to marketing attribution and ensures that the results of the attribution model will be actionable.
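A bare-bones Python sketch of the two-stage idea might look like the following. Stage one estimates a baseline conversion rate per customer type, and stage two measures how far each campaign's exposed visitors depart from that baseline. Estimating the baseline from visitors with no campaign touches is an assumption made for this example (a real model would need a more careful propensity estimate, since untouched visitors may differ from touched ones), and the segment names, record layout, and function names are all hypothetical.

```python
from collections import defaultdict

def baseline_rates(records):
    """Stage 1: conversion rate per segment among visitors with no campaign touch.

    records: list of (segment, campaigns, converted) tuples (hypothetical layout).
    """
    count = defaultdict(int)
    conversions = defaultdict(int)
    for segment, campaigns, converted in records:
        if not campaigns:
            count[segment] += 1
            conversions[segment] += int(converted)
    return {s: conversions[s] / count[s] for s in count}

def incremental_lift(records, baselines):
    """Stage 2: per campaign, (observed - expected conversions) per exposed visitor."""
    exposed = defaultdict(int)
    observed = defaultdict(int)
    expected = defaultdict(float)
    for segment, campaigns, converted in records:
        if segment not in baselines:
            continue  # no baseline for this segment, so lift cannot be measured
        for campaign in set(campaigns):
            exposed[campaign] += 1
            observed[campaign] += int(converted)
            expected[campaign] += baselines[segment]
    return {c: (observed[c] - expected[c]) / exposed[c] for c in exposed}

records = (
    # Existing subscribers renew at 85% with or without the renewal email.
    [("subscriber", [], True)] * 17 + [("subscriber", [], False)] * 3
    + [("subscriber", ["renewal_email"], True)] * 17
    + [("subscriber", ["renewal_email"], False)] * 3
    # Prospects convert at 10% untouched and 30% after a display ad.
    + [("prospect", [], True)] * 1 + [("prospect", [], False)] * 9
    + [("prospect", ["display"], True)] * 3 + [("prospect", ["display"], False)] * 7
)
baselines = baseline_rates(records)
print(baselines)
print(incremental_lift(records, baselines))
```

In this toy data the renewal email shows essentially zero lift even though every exposed renewal had it as the only touch, while the display campaign shows a lift of about 20 points over the prospect baseline.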
Ultimately, this comes down to what attribution is all about.
If attribution is about grabbing credit, then by all means build your 100% attribution models and use whatever method you want to figure out how to apportion the blame. It doesn’t matter which method you pick because they are all wrong.
I think attribution should be about lift not credit. Anytime you’re giving 100% credit to a marketing campaign, you’re doing something wrong. For today’s large brands, making allocation decisions based on a 100% marketing credit model is deeply and irrevocably mistaken. So the first and most fundamental step is to understand what the default conversion likelihood is for any given type of customer. That's the credit that isn't up for grabs. Then build the model of which campaigns, in which order, and in which frequency drive incremental lift. To do this right, you’re very much in the world of algorithmic attribution – weighting touches to resolve credit. But first you’ve done the critical work necessary to understand the actual impact of ALL those marketing touches on a given customer. It’s not until you understand the whole that you can credit the parts.
And the great secret of good attribution is that the whole is NOT 100% of the conversion value for almost any customer your marketing will source.