My last post on attribution generated quite a bit of comment. The comment thread on LinkedIn (where I’ve just recently begun posting) was particularly robust. I don’t usually have time during the week to review comments or reply, but I think there’s enough thought-provoking material in the various comment threads to warrant some additional thoughts.
In case you missed it, here’s a quick summary of my thoughts on attribution.
Essentially, the need for attribution analysis has been driven by the frequency of multi-touch behavior in an acquisition life-cycle. Traditionally, we’ve given credit to the marketing campaign that sourced a visitor who converted. In my old direct-mail world, if we bought a list and mailed a catalog, the list that sourced an order got the credit. In digital, however, it’s not unusual for a visitor to have multiple campaign touches in a very short amount of time. Indeed, techniques like re-marketing, buying brand PPC words and run-of-site display advertising virtually guarantee that visitors will have more than one marketing touch before they convert.
All of which, inevitably, raises the question of how to credit those various campaigns that touched the visitor. Traditional methods chose a single favored spot (usually last touch). But we know this isn’t accurate or even representative since repeated observation has shown that the order of campaign touches is far from random. You can split the credit equally across all touches, but it doesn’t really make much sense. You can weight credit (giving more to first and last, for example), but such weightings seem arbitrary. Or you can build a statistical model that apportions credit across all the touches based on which touches are most correlated to conversions. This last method seems preferable and, in fact, is the method that until very recently our practice would have suggested.
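To make the rule-based options concrete, here's a minimal sketch of the three simple schemes applied to one converting path. The channel names, the function names and the 40% end weight are all my own illustrative assumptions, not an industry standard. Notice that whatever rule you pick, the credits for a converting path always sum to 100%.

```python
from collections import defaultdict


def last_touch(path):
    """Give all credit to the final touch before conversion."""
    return {path[-1]: 1.0}


def equal_split(path):
    """Split credit evenly across every touch in the path."""
    credit = defaultdict(float)
    for touch in path:
        credit[touch] += 1.0 / len(path)
    return dict(credit)


def position_weighted(path, end_weight=0.4):
    """Favor first and last touch; end_weight is an arbitrary assumption."""
    if len(path) < 3:
        return equal_split(path)
    credit = defaultdict(float)
    credit[path[0]] += end_weight
    credit[path[-1]] += end_weight
    # Whatever is left over is shared equally by the middle touches.
    middle_share = (1.0 - 2 * end_weight) / (len(path) - 2)
    for touch in path[1:-1]:
        credit[touch] += middle_share
    return dict(credit)


# Hypothetical path: display impression, then email, then a branded PPC click.
path = ["display", "email", "brand_ppc"]
for rule in (last_touch, equal_split, position_weighted):
    credit = rule(path)
    print(rule.__name__, credit, "total:", round(sum(credit.values()), 6))
```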
Having experienced some of these attribution projects, however, it’s become increasingly clear to me that this form of attribution analysis isn’t very useful and can, in fact, be very misleading. The problem is simple to state and easy to see. Giving 100% credit for a conversion to marketing is, for most large brands, absurd. The problem with advanced attribution isn’t how the credit is apportioned, it’s the idea that 100% of the credit should be apportioned to campaigns.
In my original post, I gave three real-world scenarios that illustrated why a pure algorithmic approach to attribution is deeply broken. In Scenario #1, a new email campaign is sent to customers whose subscriptions are about to expire. Traditionally, 85% of the subscribers renew. After the email, 85% still renew, but the email, as the single marketing touch, gets credit for all those previously uncredited renewals. In Scenario #2, a run-of-site display campaign is targeted toward a site that, based on research, is a favorite for prospects who convert. The campaign gets credit (as the only touch) for all the visitors who convert after going to that site and getting an impression. But the exact same number of visitors from that site convert as did before the campaign. In Scenario #3, a branded PPC campaign cannibalizes direct clickers and appears to source new visitors even though all it’s doing is shifting visitors from typing the URL to clicking on the PPC ad.
What unites all three scenarios is that visitors with a high existing propensity to convert get a new marketing touch and this touch then receives credit for the conversion even when it had zero lift. It’s my contention that these scenarios are ubiquitous for large brands and make the apportionment of credit a useless, non-actionable exercise in the vast majority of cases.
I want to particularly emphasize Scenario #2 since the vast majority of attribution studies are done to justify the impact of display impressions. Without a lift-based model, these studies will nearly always reach the conclusion your agency wants, since they will give at least partial credit to display for the vast majority of your converters regardless of whether or not your display ads had any impact.
Instead, I suggested either massive testing (as per this article about eBay - and a slightly different take on the same news here) or a two-step attribution model that first assesses a prospect/customer’s likelihood to convert and then measures the incremental lift that marketing actually drove. Credit is then apportioned on the lift (and only the lift) using something like a traditional attribution model.
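Here's a rough sketch of what that two-step logic looks like in miniature. Everything in it is an assumption for illustration: the Segment fields, the baseline rates (which in practice would come from a propensity model or a test) and the touch weights (which could come from any attribution model you like). It is not a description of a production model.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    visitors: int          # people who received at least one marketing touch
    conversions: int       # observed conversions among them
    baseline_rate: float   # assumed conversion rate with no marketing (step 1)
    touch_weights: dict    # hypothetical weights from any attribution model


def incremental_conversions(seg):
    """Step 1: conversions above what the baseline propensity predicts."""
    return seg.conversions - seg.visitors * seg.baseline_rate


def apportion_lift(seg):
    """Step 2: split the lift, and only the lift, across the touches."""
    lift = incremental_conversions(seg)
    total_weight = sum(seg.touch_weights.values())
    return {t: lift * w / total_weight for t, w in seg.touch_weights.items()}


# Made-up numbers. The first segment mirrors Scenario #1: 85% renew either way.
renewals = Segment("expiring subscribers", 1000, 850, 0.85, {"email": 1.0})
prospects = Segment("prospects", 10000, 300, 0.02, {"display": 1.0, "ppc": 3.0})

for seg in (renewals, prospects):
    print(seg.name, apportion_lift(seg))
```

With these made-up inputs the renewal email ends up with essentially no credit, while the prospect campaigns share only the conversions above baseline rather than all 300.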
So now to the comments/additional thoughts:
Attribution has nothing to do with Multi-Touch
Historically, as I wrote above, attribution has everything to do with multi-touch. But the entire thrust of this argument is that “last-touch” is deeply flawed even in circumstances where there is no multi-touch behavior. This runs directly counter to our own previous thinking. We used to advise companies without significant multi-touch behavior to skip attribution. Of course, that made sense as long as attribution was focused solely on apportioning credit. It would have been a waste of time, given the methods we were using, to do that analysis in a predominantly single-touch environment. However, this approach suggests that attribution (measurement of lift and credit) is appropriate for every marketing environment, even ones without significant multi-touch. William Dean Donovan’s comment about negative lift is particularly on point here. It’s perfectly conceivable that in my Scenario #1, only 84% of the people converted after getting the email, meaning that it actually LOST customers compared to the baseline. In a traditional attribution world, you’d be giving gobs of credit to a campaign that was actually costing you customers!
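The negative-lift case is easy to see with a little arithmetic. In this sketch I'm assuming the baseline comes from a holdout group that didn't get the email; the holdout design, the function name and the counts are my own illustrative assumptions, chosen to match the 84% vs. 85% example.

```python
def holdout_lift(treated_n, treated_conversions, holdout_n, holdout_conversions):
    """Compare the mailed group against an unmailed holdout (assumed design)."""
    treated_rate = treated_conversions / treated_n
    baseline_rate = holdout_conversions / holdout_n
    rate_lift = treated_rate - baseline_rate
    # Incremental conversions attributable to the campaign; can be negative.
    incremental = rate_lift * treated_n
    return rate_lift, incremental


# Hypothetical counts: 84% of mailed subscribers renew vs. 85% of the holdout.
mailed, mailed_renewals = 10_000, 8_400
held_out, held_out_renewals = 1_000, 850

rate_lift, incremental = holdout_lift(mailed, mailed_renewals,
                                      held_out, held_out_renewals)

print("last-touch credit:", mailed_renewals)      # every renewal goes to the email
print("lift-based credit:", round(incremental))   # negative: the email cost renewals
```

A holdout this small wouldn't prove a one-point difference is real, of course. The point is only that the two ways of counting can disagree on the sign.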
What to do if you don’t have Gobs of Money
We mostly work at the high-end of the world with very large enterprises willing to invest a lot of money to get good answers since they are SPENDING a lot of money. But what about the rest of the world? What do you do, as Greg Moore asks, if you can’t afford eBay’s massive testing?
I’d make a couple of points here. First, there are vendors and consultants that have built whole businesses out of supplying sophisticated algorithmic attribution models (with no lift modeling) to assign credit; so quite a few companies are willing to spend money to get this wrong. It doesn’t really take more money to get it right. If, for example, you just upgraded to Adobe Premium so you could do attribution analysis, you’ve got the wherewithal and the technology to tackle this problem correctly. You just don’t have the right approach.
Second, the fundamental point here is useful even in a low-budget, sporadic testing environment. You don’t need to do a complex 2-phase attribution model to know that if you just sent a new email to potential renewing customers, the appropriate measurement of credit is lift, not conversion. It takes an advanced model to make such a stupid mistake! Once you understand that attribution should be about credit after lift, there are myriad easy tests and decisions that will improve your thinking compared to last-touch attribution (or to any alternative weighting based on a model).
Indeed, I’m going to flat-out disagree with this comment:
“This isn’t based on research...but we can split the Attribution market into several groups: one is comprised of the top global advertisers who spend millions on their campaigns. They can use an amazing solution…to maximize the efficiency of their placements, creatives and even more granular elements. For them, getting super accurate with their attribution is key. But there is also another group – probably a much bigger group than the first – that includes people who don’t fully grasp that their conversions are preceded by multi-touch interactions. For them, attribution is not about finding the perfect budget allocation balance, but rather the encouragement to look beyond the last click, and to act on the data accordingly. This can be addressed with a simple rules-based attribution tool.”
I believe the super-sophisticated group is (mostly) getting this completely wrong and wasting all that money and time. The simple guys aren’t doing any better. The problem isn’t the tools, it’s the mindset of assigning 100% credit to campaigns. Attribution based on 100% credit is stupid. Last touch, first touch and multi-touch are all pointless within that context. I know I said differently a year ago. I’ve learned. If you take one thing away from this conversation it should be this: lift first, credit second.
In today's world, if your attribution system is giving 100% of the credit for ANY conversion to a single campaign, it's almost certainly wrong. If your system is giving 100% of the credit for any conversion to ALL of the campaign touches, it's just about as certainly wrong.
Segments of One
Michael Taylor points out one of the great challenges in proper attribution – the difficulty of testing at meaningful levels. Real performance in digital campaigns is often driven at a very small level. Those low volumes make testing a challenge. You can do deprivation testing on a whole channel, but what if only 10% of the channel drives all the lift? Deprivation testing may suggest that the channel is working even though you’re wasting 90% of your money! This isn’t farfetched. We’ve seen PPC programs where a very small part of the overall program drove virtually all of the real program lift. As you drive down levels, however, it gets harder and harder to drive statistically significant results. It also forces you into vast amounts of testing. I don’t have a magic bullet here. A model will often suffer from the same problems that a test will. I will say that the 2-step model I’ve suggested is, at least, a way to tackle the broad problem without an insane amount of testing. It won’t solve every problem and the small sample-size problem will still be with us.
Just something to be aware of.
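To put rough numbers on the small-sample problem, here's a sketch using the standard normal-approximation formula for comparing two conversion rates. The baseline rate, the lift, the volumes and the 5% significance / 80% power settings are all assumptions I picked for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, lift, alpha=0.05, power=0.8):
    """Approximate visitors needed in each of test and control to detect lift.

    Uses the usual two-proportion normal approximation (two-sided test).
    """
    p1 = baseline_rate
    p2 = baseline_rate + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / lift ** 2)


# Hypothetical: 2% baseline conversion, hoping to detect a half-point lift.
needed = sample_size_per_arm(baseline_rate=0.02, lift=0.005)

# Made-up volumes per arm at two levels of the same program.
volumes = {"whole channel": 200_000, "single ad group": 2_000}
for level, visitors in volumes.items():
    print(level, "needs", needed, "has", visitors, "enough:", visitors >= needed)
```

Under these assumptions the whole channel has plenty of volume and the single ad group is nowhere close, which is exactly the bind: the level where the lift actually lives is the level where you can't measure it.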
Thinking about Targeting Propensity
Ben Sidebottom’s comment is interesting and astute:
“Incrementality is critical when comparing media that was targeted to audiences with differing baseline propensity to convert. If media was mostly targeted to audiences with similar baseline propensity to convert, then you can still make good decisions optimising across just those placements as the correlation/causality gap would be mostly uniform across those placements.”
This gets at the gist of the argument and says it very well indeed. However, I’m skeptical that, in the absence of a lot of work (work that isn’t done in a traditional attribution analysis), we know when we’ve targeted audiences with a similar propensity to convert. In my three Scenarios, only #1 stands out as a clear case of knowing that we’ve targeted a segment with a different propensity to convert. In the other two scenarios, it’s not clear that we’ve targeted customers with a different propensity to convert and, in fact, in scenario #2, the whole point of the display site targeting is to try and find those places where there is a higher propensity to convert. I just don't think this rescues traditional attribution analysis.
It doesn't seem to me that there are large realms where pure algorithmic attribution is appropriate, or that you will somehow be okay if, for example, you avoid comparing customer campaigns to prospect campaigns. Even in those areas where you think, a priori, that you’re targeting prospects with a similar propensity to convert, there’s a better than even chance that you’re not.
Media Mix Modeling
There’s been a long-running battle between proponents of Media Mix vs. Attribution as a method of optimization, and that can also spill over into debate between testing and attribution modeling. Here’s Peter Ahl’s thoughtful comment on this debate:
“A/B testing in the context of measuring lift in either Customer Acquisition or Customer Retention is essentially Media Mix Modelling (MMM). We can do this backward-looking from the natural variability in advertising spend across channels. Over time new channels have been added, others discarded. We can do this forward-looking by planning experiments, by increasing/lowering spend in each advertising channel over time. But then, it is not only advertising (reach and frequency) that impacts the response rate; price promotions and pricing strategy, conversion rate optimisation, messaging (creative), the competitive landscape - to only name a few other explanatory variables, are also contributors. It is fair to say though, that flawed measurement is better than no measurement. Media Mix Modelling is the measurement of 'lift', as opposed to 'measurement of level' - which is what marketing attribution in all guises do. Attribution can be thought of as an explanatory model. A predictive model needs to be built from (natural of pre-conceived) variability in the explanatory variables - call it A/B testing or MMM.”
On the whole, I too come down on the side of MMM vs. traditional attribution. But, as these posts should make clear, that’s largely because I have very little confidence in traditional attribution. I also believe that it will prove impossible to answer some questions around appropriate spend except by careful testing. The world is too complex and our data too scarce to allow modeling to achieve the full right answer.
However, the 2-stage model I’ve suggested is, properly speaking, not a Mix model. It’s built differently and uses different types of data. A Mix model doesn’t need to know anything about Prospects or Customers. It works simply by relating variation in inputs to variation in outputs. That’s not what my two-phase attribution model approach does at all. Instead, I think it’s a way to do better attribution modeling, one that incorporates a baseline model so that lift can be measured, moving attribution modeling into the territory of MMM but at a much finer-grained level of detail and actionability.
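For contrast, here's about the simplest possible picture of what a Mix model does: regress an aggregate output on an aggregate input, with no visitor-level data at all. This is a toy with one channel, made-up weekly numbers and a plain least-squares line. A real MMM would include many more explanatory variables, so treat it as a sketch of the idea only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical weekly totals: spend in thousands of dollars, and conversions.
weekly_spend = [0, 10, 20, 30, 40]
weekly_conversions = [500, 520, 540, 560, 580]

per_thousand, baseline = fit_line(weekly_spend, weekly_conversions)
print("conversions with zero spend:", baseline)
print("extra conversions per $1k of spend:", per_thousand)
```

The intercept plays the role of the baseline and the slope plays the role of lift, but nothing here knows which visitors were prospects or customers. That visitor-level detail is what the two-step approach adds.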
I've always said that Attribution Modeling and MMM get at rather different problems. I still believe that, but the approach I'm suggesting here narrows the differences and greatly extends the range of practical problems that an attribution analysis can address.
Whew...even with all this I haven't addressed all the comments, though I've tried to cull out the ones that really struck me. It's great to have so much thoughtful and interesting response to react to. Hope this 2nd installment on attribution adds value as well!