Last week I had the pleasure of listening to Eric Peterson speak not once, but twice. The first time was during a Coremetrics webinar on campaign attribution and the second later that evening at the local Web Analytics Wednesday where Eric delivered a longer presentation that included the same attribution material. And while I have a great deal of respect (and even friendship) for Eric and an equal amount of respect for Coremetrics, I feel a need to challenge the content.
For a while I’ve been speaking about the emergence of the third generation of web analytics, as I call it. For those who haven’t heard me present this before, the first generation was characterized by IT departments measuring web site activity via software installations of log file analysis tools. The second generation was dominated by marketing departments utilizing hosted solutions and page tagging. The primary value these two generations of solutions provided was aggregate reporting, along with rudimentary ad-hoc analysis capabilities (rudimentary, that is, compared to modern business intelligence systems).
Whereas the first two generations were characterized by reports, the third is about the data – open access to un-aggregated visitor detail data and the endless forms of true analysis that can be performed with it. Knowing that Coremetrics is one of a few major vendors (along with WebTrends) to store un-aggregated data in an industry-standard database, I was expecting a thoughtful discourse on statistical modeling. Alas, what we were told was to use not one but three flawed attribution models (last, first and equal), in hopes, I suppose, that three wrongs would make a right.
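For readers who haven’t seen the three heuristic models side by side, here is a minimal sketch in Python. The function name, the campaign names and the click-paths are all made up for illustration; this is not how Coremetrics, WebTrends or any other vendor implements attribution.

```python
from collections import defaultdict

def attribute(paths, model):
    """Give each converting click-path one unit of credit, split by a heuristic.

    paths: list of click-paths, each an ordered list of campaign names
           (hypothetical data; only CONVERTING visitors are included).
    model: "last", "first" or "equal".
    """
    credit = defaultdict(float)
    for path in paths:
        if not path:
            continue  # no campaign clicks, nothing to credit
        if model == "last":
            credit[path[-1]] += 1.0      # all credit to the final click
        elif model == "first":
            credit[path[0]] += 1.0       # all credit to the first click
        elif model == "equal":
            share = 1.0 / len(path)      # split credit evenly across clicks
            for campaign in path:
                credit[campaign] += share
        else:
            raise ValueError(f"unknown model: {model}")
    return dict(credit)

# Made-up click-paths for three converting visitors.
converting_paths = [
    ["email", "search", "display"],
    ["search"],
    ["display", "search"],
]

for model in ("last", "first", "equal"):
    print(model, attribute(converting_paths, model))
```

Notice what all three have in common: the only input is the click-paths of visitors who converted. Visitors who clicked a campaign and did not convert never enter the calculation.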
Ever since our high school statistics classes we have been taught the difference between correlation and causality. Statistics show that as ice cream sales increase, so do drowning deaths. Therefore, ice cream causes drowning, right? Of course not – it is the onset of warmer temperatures that indirectly leads to both. As trite as this example may seem, it is no different than the fallacy that a campaign’s inclusion in a visitor’s click-path prior to conversion means that it had a causal effect on the conversion, or that it belongs in our campaign portfolio. The same campaign may have been clicked on by many more non-converting visitors … at substantial expense.
True, if a visitor clicked on a campaign prior to conversion, it’s certainly more likely to have had a causal impact. That’s especially true for the last campaign. But if we’re going to finally break away from the flawed last-click attribution model, why not do it correctly? We have the data – let’s use a statistical model.
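As a rough illustration of what “using the data” could look like at its very simplest, the sketch below compares conversion rates between visitors who clicked a campaign and visitors who did not, and computes a standard two-proportion z-score for the difference. The function name and the visitor data are invented for this example. It is only a starting point: a difference in rates is still an association, and a real model (a logistic regression over all campaigns and visitor attributes, say) would need to control for confounding factors before anyone claims causality.

```python
import math

def campaign_lift(visitors, campaign):
    """Compare conversion rates for visitors who did / did not click a campaign.

    visitors: list of (campaigns_clicked, converted) tuples, where
              campaigns_clicked is a set of names and converted is a bool.
              Non-converting visitors are included on purpose.
    Returns (rate_exposed, rate_unexposed, z_score).
    """
    exposed = [conv for camps, conv in visitors if campaign in camps]
    unexposed = [conv for camps, conv in visitors if campaign not in camps]
    n1, n0 = len(exposed), len(unexposed)
    if n1 == 0 or n0 == 0:
        raise ValueError("need visitors both with and without the campaign")

    p1 = sum(exposed) / n1      # conversion rate among clickers
    p0 = sum(unexposed) / n0    # conversion rate among everyone else

    # Two-proportion z-test using the pooled conversion rate.
    pooled = (sum(exposed) + sum(unexposed)) / (n1 + n0)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    z = (p1 - p0) / se if se > 0 else 0.0
    return p1, p0, z

# Made-up data: 100 visitors clicked "display" (2 converted),
# 100 visitors clicked only "search" (5 converted).
visitors = (
    [({"display"}, True)] * 2 + [({"display"}, False)] * 98
    + [({"search"}, True)] * 5 + [({"search"}, False)] * 95
)

p1, p0, z = campaign_lift(visitors, "display")
print(f"display clickers: {p1:.1%}, others: {p0:.1%}, z = {z:.2f}")
```

In this invented data set the display campaign appears in two converting click-paths, so any of the three heuristic models would hand it some credit. Yet its clickers convert at a lower rate than everyone else, which is exactly the kind of signal the heuristics cannot see.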
Now, for the less mathematically savvy user of web analytics: no, this doesn’t mean your solution will be more complicated. Quite the contrary. Before credit card companies implemented mathematical models to detect fraud, we consumers would learn of fraud only after we received our statement. And then, after weeks of arguing with our vendor, we might have gotten the charges removed. Today we get a phone call within hours of the questionable transaction and a new card sent overnight to us, no questions asked. Math made our lives easier.
So will it be for campaign attribution. Imagine a campaign report that tells you, in a statistically valid way, which campaigns and campaign attributes actually had a positive contribution to conversion and to your campaign budget, versus those that didn’t. Then imagine that same report telling you how to improve results. I propose the following report:
Don’t sweat the details – I just punched some example data into a spreadsheet. Instead, focus on the bigger picture of having a report that shows you how your campaigns truly performed and recommends to you an adjusted mix based on the current set of campaigns. Then imagine the data for auction-based networks being automatically passed to an automated campaign optimization system. Now that would be progress towards true optimization of campaign budgets while also making the marketer’s job much easier.
Note that at the moment WebTrends doesn’t provide the above report either (but we do have the requisite data in a readily accessible format). My point is that it’s time to embrace the third generation of this industry and start truly leveraging the data in mathematically and scientifically valid ways.
P.S. Please send me your thoughts on the dream campaign report.