After posting about doing some testing on my Facebook status (I don’t use Twitter), I got a couple of queries about what I was up to. My current project is a bit hush-hush, so I thought I would introduce this case study instead. I have also provided what I hope are some handy hints dotted throughout.
Affiliates are great. No, seriously: they are the online seller’s best asset, and without them many of the big stores would barely make a penny. So it’s strange how many stores take their affiliates for granted, treating them as one homogeneous block. We were asked to do some work for a client and I thought I would share some of the theory behind it (warning: this post contains logic and a sprinkling of basic math). Even though it’s not strictly an SEO post, many SEOs are asked to perform this sort of statistical analysis every day.
Company A sells a diverse range of products. It has a large affiliate base and is keen on its affiliates doing the leg work; its philosophy is to help the affiliates along as much as possible. While it doesn’t have a dedicated affiliate landing page (it prefers to encourage deep linking to specific products), its “special offer” pages are used within PPC campaigns and as a focal point for affiliates’ activities. These pages are regularly tested, but traditional split testing has caused a problem. Several of the high-earning affiliates have low conversion rates on these pages, while the PPC campaign is doing well and, according to traditional split testing, the current page is the best performing of all the recipes presented over a large dataset.
The traditional solution
We have numerous traffic sources all arriving at one area. The traffic is bound to be looking for different things and to be at different points along the buying process, so it makes sense to simply split this traffic into groups and optimise for each group. So we now have 3 separate experiments running, each with the same recipes. The results come in: PPC is up, as is the misc organic traffic, but affiliate conversions, while up slightly, are still disappointing. Time to investigate what’s going on.
Ecommerce sites have a wealth of information about their users and their buying practices, and with a little bit of data mining we can create profiles of shoppers on the site (we have 12 in this study). When we match those profiles with where the shoppers came from, we can begin to work out what sort of user is coming from each traffic source. Obviously there is crossover, and sources such as PPC and organic traffic should produce a diverse set of profiles. So will affiliates at first glance; it is only when you break down your affiliate traffic by affiliate that the number of profiles reduces.
Since I really wasn’t interested in affiliates making a sale a year, I quickly discarded a few thousand records and associated bits to leave a core group of 500 or so. Each one’s sales patterns were looked at and their users profiled. Interestingly, the top 5 affiliates had far less crossover in profiles than those earning a little less. This suggests they are working distinct niches and tending not to step on each other’s toes. The next step was to compare these profiles to the conversion information for our landing pages: were they way outside the normal profiles?
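The "crossover" between affiliates can be put into numbers. The sketch below is only one possible way to do it, not the method used in the study: it assumes each affiliate has already been reduced to the set of shopper profiles seen in its sales, and scores every pair of affiliates with a Jaccard overlap. The affiliate ids, the "gadget fan" profile and the function names are all made up for illustration.

```python
from itertools import combinations

# Hypothetical data: affiliate id -> set of shopper profiles seen in its sales.
affiliate_profiles = {
    "aff_1": {"bargain hunter", "gadget fan"},
    "aff_2": {"impulse"},
    "aff_3": {"gadget fan", "impulse", "infrequent"},
}

def jaccard(a, b):
    """Share of profiles two affiliates have in common (0 = none, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def crossover(profiles_by_affiliate):
    """Jaccard overlap for every pair of affiliates, keyed by the sorted pair of ids."""
    return {
        (x, y): jaccard(profiles_by_affiliate[x], profiles_by_affiliate[y])
        for x, y in combinations(sorted(profiles_by_affiliate), 2)
    }

overlaps = crossover(affiliate_profiles)
```

Pairs with a score near zero are working separate niches; pairs with a high score are competing for the same kind of shopper.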
Quick tip – Profiling and data mining is one of the key elements of ecommerce that even some of the big boys can’t manage to get right! Take Amazon as an example of a company with a flawed profiling and data mining setup. While it’s clear they are profiling users, they are not cross-referencing with previous purchases, causing them to send emails suggesting you purchase books you already bought from them! If you’re interested in profiling and customer data mining I strongly recommend Scoring Points: How Tesco Continues to Win Customer Loyalty. It’s a good read over the Christmas break.
Our best-performing profiles on the landing pages were 3 distinct profiles; to keep things simple we shall call them the bargain hunter, the infrequent and the impulse. In reality each of these had 2 or more subset profiles, but we can quickly see that the special offers landing pages are converting people who are looking for bargains, people who don’t spend often with the store, and those whose shopping patterns are erratic but who often put more than one thing in the basket before checkout. This is not surprising, but it’s always nice to have stats back up common sense, particularly as that’s not always the case! Our top affiliates’ user profiles were on the whole not part of this high-converting group, though there was some crossover with the impulse profile. Interestingly, affiliates were producing the highest returning-customer group even from these pages (indicating that even if visitors didn’t buy the deal, they were remembering the site).
Throughout all of this one affiliate stood out. His conversion rate was close to 25%, and while his traffic stats were similar to the rest of the top affiliates, these figures were so high they were hard to believe, and our client was suspicious that something was afoot. His traffic was almost entirely a single profile, purchasing specific products and only those products. For our client his users were infuriating: not responding to future mailings and rarely returning except through his links to specific products. Something was clearly going on. Could he be part of some secret shopping sect? Was he brainwashing? Or was something darker going on?
Investigating an Affiliate
The vast majority of affiliates get on with their affiliate managers; it’s a symbiotic relationship in which one can’t exist without the other. Most affiliates therefore don’t mind when their manager calls for a chat, as often it’s going to be about some offer or a rise in commission. When someone refuses to communicate it signals, if nothing else, a deterioration in the relationship. When we were unable to contact this particular affiliate it seemed we might have a bad egg, so it was time to unravel his traffic sources. Like most affiliates he used a couple of central redirects, both to hide the real source and for stats gathering. A whois and DNS lookup showed that both of these were on a single server, and after a few minutes we had 122 domains, mainly .infos, to look through. Passing those through our link mapping software produced nearly 400k backlinks across the network. Nice, that’s a weekend of fun. Manually scanning through such a large dataset is not really feasible, but taking a small sample of the 122 indicated most could be considered “made for AdSense” style sites: scraped or just crud content, with links pointing to other parts of the network and a pile of ads. Thankfully the link mapping also revealed 6 domains (all hosted on separate servers) that the network was pointing to. So far we still only have a suspicion and some dodgy link building methods, certainly nothing to damn the affiliate, but we now also have the main sites he has been promoting.
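The "both redirects sit on one server" check can be roughed out in a few lines. This is only a sketch of the DNS half of the job, not the tooling we actually used: it assumes you already have a list of domains to check, it groups them by the IP address they resolve to, and the function name and the `resolve` parameter are my own inventions. Finding every other domain hosted on that server needs a separate reverse-IP data source, which is not shown.

```python
import socket
from collections import defaultdict

def group_by_ip(domains, resolve=socket.gethostbyname):
    """Map each resolved IP address to the list of domains that point at it."""
    by_ip = defaultdict(list)
    for domain in domains:
        try:
            by_ip[resolve(domain)].append(domain)
        except OSError:
            # socket.gaierror is a subclass of OSError; keep failed lookups visible.
            by_ip["unresolved"].append(domain)
    return dict(by_ip)
```

Any IP with more than one domain against it is worth a closer look. Passing in your own `resolve` function lets you replay saved lookups instead of hitting DNS again.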
Quick tip – People tend to think micro instead of macro. Making your site look less spammy at a human glance doesn’t help if you’re linking into a large network. Companies and search engines look for large-scale patterns and work backwards, which is why paid links from brokers are so easy to identify if you have a large enough dataset.
The 6 sites used some interesting cloaking to serve overly keyword-stuffed content to Google, but on the whole they had good landing pages. They were auto-generated and produced discount coupons for every product. They were also incentivising the offers through a Ponzi scheme (the cheeky bugger!): incentives were only issued if a user not only bought the product with the coupon but also recruited 3 other users who did the same. The whole thing was financed by the fact that the affiliate earned a standard commission regardless of the coupon, and so could use the difference to pay out in the unlikely case he needed to. I have no doubt that when you joined “the club” you would be put on a mailing list and treated to a bombardment of mailshots. The system was ingenious but totally against our client’s TOS and, depending on exactly how he was funding the payouts, possibly illegal in the UK. So having ruled out our super affiliate as the answer, we handed our findings over to the client to deal with and went back to the original question.
With our anomaly ruled out as a way to improve affiliate conversions, it was time to start gathering information on the presale our affiliates were using. This was done in more or less the same way we investigated our first affiliate. In addition, each affiliate was sent a very quick questionnaire, and after a couple of follow-up conversations with several of them we probably knew more about our client’s affiliates than the client did!
Our client’s affiliates fell into four main groups: the lander, the blogger, the comparer and the mailing list guru. The landers are affiliates using lots of mini sites to push very targeted products, and are often heavily into PPC and SEO. The bloggers run blogs, which in our client’s case were mainly gadget and sport blogs; they would target individual products but would often come back to our client’s site over and over again, and they were the group with the most residual traffic after a launch. Comparison websites are big business for affiliates, and with some studies showing 1 in 4 people now using a comparison site to find online purchases it’s no surprise that 2 of our client’s top 5 affiliates were primarily using comparison sites to drive traffic. Finally the mailing list, such a powerful tool in Internet marketing, was actually the least used, but as our client mentioned when we started, they always knew when one of their affiliates had released his newsletter by a small spike. The downside: this affiliate brought almost no long-term traffic.
Quick Tip – It’s interesting to note that most big affiliate marketers, while they might have fingers in many pies, tend to have one major type of traffic source even if they have multiple sites and lists. However, the top affiliate in this study (ignoring the club owner) used a blog and mailing list combination, which in turn had comparison charts.
With our affiliates’ traffic sources and profiles in mind we went back to the split testing. Providing a unique, adapting page for every affiliate was not practical, so instead we took our 4 traffic source groups and cross-referenced them with their user profiles. The result was 7 diverse profile groups, each of which was then run in an experiment of its own with the original recipes. Results were pleasing, with all 7 experiments seeing a large leap in conversions. Interestingly, none of the 7 experiments were exact matches with each other, yet our comparison website profile experiment matched the PPC one.
Explanation – OK, someone is bound to ask: if we have experiments which rely on profile information and we don’t yet have any data on the user, how are we assigning a user profile before gathering that information? The answer is that we, or rather our testing software, make an educated guess; we refine the result later and, where the guess was wrong, mark it as a corrupt test and push it back into the correct group. The guess itself is generated using a genetic algorithm. The demographic info we do have available is introduced as a means of causing mutation within our default population. To provide an initial population in the experiments we include two or more mutators (profile indicators), and we feed back information from successful sales, which should introduce two or more variants. Once we have enough feedback information we can remove or scale back our initial mutators.
Alternate Approach – Evolution
While we rely on genetic algorithm math to make educated guesses about profiling, it can actually be used to drive the entire landing page test as an alternative to our experimentation. Rather than considering our traffic source as the population, we would consider the page to be the population and our testing variables the individuals. Since genetic algorithms are ideally suited to scoring an individual within a set as positive or negative, we can use the same math to “breed” our individuals (testing variables) to identify the strongest matching pairs, which can in turn be tested against one or more other pairs or individuals. There have been several interesting experiments on this type of split testing and it’s something we have found works well on large datasets. However, the additional processing power may not be justified unless the results show a marked improvement in conversions.
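For anyone who hasn’t met a genetic algorithm before, here is a deliberately tiny sketch of the breeding idea. It is not our testing software, and it simplifies things: each individual here is one combination of testing variables (one candidate page). The variables, their options, the mutation rate and the `toy_fitness` function are all made up. In real use the fitness would be a measured conversion rate, which is exactly where the large dataset and the extra processing come in.

```python
import random

# Hypothetical testing variables; each individual picks one option per variable.
VARIABLES = {
    "headline": ["save", "new", "limited"],
    "button": ["buy now", "add to basket"],
    "image": ["product", "lifestyle"],
}

def random_page(rng):
    """Create one individual by choosing a random option for every variable."""
    return {name: rng.choice(options) for name, options in VARIABLES.items()}

def breed(parent_a, parent_b, rng, mutation_rate=0.1):
    """Crossover: take each variable from either parent, with occasional mutation."""
    child = {}
    for name, options in VARIABLES.items():
        child[name] = rng.choice([parent_a[name], parent_b[name]])
        if rng.random() < mutation_rate:
            child[name] = rng.choice(options)
    return child

def evolve(fitness, generations=20, size=8, seed=0):
    """Keep the fitter half each generation and breed replacements from it."""
    rng = random.Random(seed)
    population = [random_page(rng) for _ in range(size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: size // 2]
        children = [breed(*rng.sample(parents, 2), rng)
                    for _ in range(size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

def toy_fitness(page):
    # Stand-in for a measured conversion rate: counts matches with an arbitrary target.
    target = {"headline": "limited", "button": "buy now", "image": "lifestyle"}
    return sum(page[name] == value for name, value in target.items())

best = evolve(toy_fitness)
```

Because the fitter half survives each generation untouched, the best score never goes backwards; the mutation step is what stops the population settling too early on a mediocre page.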
Quick Tip – A lot of Internet marketers will refer to a statistical technique called the Taguchi Method; indeed many see it as a holy grail, as it allows you to run experiments with what would traditionally be considered small datasets and return results similar to large-scale tests. Don’t be fooled down this route. If your dataset is not large enough to run large multivariate tests, concentrate on smaller A/B split tests and on increasing your dataset size; any gains from using statistical analysis methods will be minimal at this size anyway.
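To see why dataset size matters so much for a plain A/B split, here is a minimal sketch of a standard two-proportion z-test. It is a textbook check, not something taken from the client work, and the function name and visitor numbers are invented. The same 10% versus 15% conversion rates are compared at two sample sizes.

```python
from math import erfc, sqrt

def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in the data, the test is undefined")
    z = (p_b - p_a) / se
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical numbers: identical conversion rates, ten times the traffic.
z_small, p_small = ab_test(20, 200, 30, 200)
z_large, p_large = ab_test(200, 2000, 300, 2000)
```

With 200 visitors per recipe the difference is not significant at the usual 5% level; with 2,000 per recipe it clearly is. No clever statistics will change that, only more data.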
Quick Tip – Statistical data is temporal: it is affected by time. An obvious example is that a searcher looking for Funny Christmas Hats today (mid December) is likely to be a different profile to one looking in January. The search may be the same, as perhaps is the intention, but chances are the result is not.
Affiliates are not a single group
When running a special offer page the client site now has 9 experiments running on it, reflecting 12 profile groups and 6 distinct traffic sources. The alternative, without analysing both the profiles and traffic sources, would have been to assume that each traffic source was capable of producing all 12 profiles (which they are, but in most cases in statistically negligible numbers), resulting in 72 separate experiments. Without profiling we would never have been able to identify personality groups and would have relied on traffic source and guesswork. The key is to remember that testing is only useful if you know what variables are actually controlling the test.
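The arithmetic is simply 12 profiles × 6 sources = 72. The sketch below shows the pruning idea in miniature: keep a source and profile pair only when that profile makes up a meaningful share of the source’s traffic. The shares, the 5% cut-off and the function name are made up, only two sources and three profiles are shown, and the real study went further by merging the surviving pairs into 9 experiments.

```python
def experiments_needed(share_by_source, min_share=0.05):
    """List (source, profile) pairs with at least min_share of that source's traffic."""
    return [
        (source, profile)
        for source, shares in share_by_source.items()
        for profile, share in shares.items()
        if share >= min_share
    ]

# Hypothetical shares of each source's visitors by profile.
share_by_source = {
    "ppc": {"bargain hunter": 0.50, "impulse": 0.40, "infrequent": 0.10},
    "blogger": {"bargain hunter": 0.02, "impulse": 0.97, "infrequent": 0.01},
}

naive = 12 * 6  # every profile tested against every source
kept = experiments_needed(share_by_source)
```

In this toy data the PPC source keeps all three profiles while the blogger source keeps only one, which is the same pattern we saw once affiliate traffic was broken down.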
I hope this little study has helped people to see the potential of mining your data and that of your affiliates, but it’s been very much theory based, so here are some handy and slightly more practical tips.
You cannot know your customers individually, and eventually you have to accept that you are going to have to start grouping them. Using their spending habits along with other demographic information you should be able to split your users into between 6 and 18 distinct groups. Once you have your profile groups you can target and market to them.
Pool your data externally
Most sites do not have all their customer data in one place: apart from order and purchase logs, stats will be held elsewhere and questionnaire data in yet another location. Keeping in mind the concept of temporal data, it is far easier to take snapshots of your data sources and pool them for analysis. While most of the work is automated, initial analysis and some interim work still need to be compiled, so having a stats package handy is always good. While I know most can’t justify a copy of Crystal, Matlab or SPSS, there are free alternatives that will work well for you, such as the open source PSPP.
Quick Tip – Make sure your temporal data is all in the same timezone, particularly for non-US users whose servers are in the US.
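A minimal sketch of what that means in practice: convert every timestamp to UTC before pooling. The timestamp format, the dates and the offsets here are assumptions for illustration, and fixed offsets are used only to keep the example short. Named zones (for example via Python’s `zoneinfo` module) are the better choice in real use because they handle daylight saving.

```python
from datetime import datetime, timedelta, timezone

def to_utc(stamp, utc_offset_hours):
    """Read a naive 'YYYY-MM-DD HH:MM:SS' stamp recorded at a fixed UTC offset; return it in UTC."""
    source_zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=source_zone)
    return local.astimezone(timezone.utc)

# Hypothetical records of the same moment from two systems.
server_hit = to_utc("2008-12-15 20:30:00", -5)  # US East Coast server log (UTC-5)
uk_order = to_utc("2008-12-16 01:30:00", 0)     # UK order system (UTC+0 in winter)
```

Left as they were, those two records would appear to be five hours and a calendar day apart; once converted they line up.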
Is it a gift?
During the purchasing process it’s a great idea to understand the reason for the purchase. This simple, easy-to-answer question allows us to leave the purchase out of the user’s normal product profiling; what’s more, it allows a simple upsell of gift wrapping and alternate delivery at a small cost.
Interrogate your data not your users
Questionnaires are a great source of information. Most opinion is subjective, though the demographic data is not, and people who answer such questions will lean towards one or more profile groups. When asking users questions keep it short and sweet: you are looking for ways to “enhance” their future experience, not put them off ever shopping with you again.
Think Tescos – Unique Coupons
If I have a set of distinct profiles derived from purchase history, and I have a user’s purchase history, it should be easy enough to provide a unique coupon designed to encourage them back to the store. Tesco are experts at this through their Clubcard, but very few online retailers use targeted coupons, mainly because of delivery methods. But by providing a monthly newsletter, with generic material chosen based on purchase history and unique coupons based on purchase history and profile, a user can be tempted back.
A simple example: 2 users, both interested in computers, have both purchased several items in the past. Customer A is an impulse buyer. His coupon is a percentage discount on a single product; the discount is substantial, but the site makes money on the chance that he will add more items to the trolley at full price. Customer B is a bargain hunter. He is presented with a percentage off all items, which is significantly lower than customer A’s percentage off, because customer B is likely to purchase only a couple of items and so the store wishes to retain some markup on them.
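The example above can be sketched in code. The percentages, prices and function names are invented purely to show the mechanics, and prices are held in pence to avoid rounding surprises.

```python
def coupon_for(profile):
    """Return a hypothetical coupon rule for a shopper profile, or None."""
    if profile == "impulse":
        return {"scope": "single product", "percent_off": 30}
    if profile == "bargain hunter":
        return {"scope": "all items", "percent_off": 10}
    return None

def basket_total(prices_pence, coupon, discounted_index=0):
    """Total in pence after the coupon; single-product coupons hit one item only."""
    total = 0
    for i, price in enumerate(prices_pence):
        applies = coupon["scope"] == "all items" or i == discounted_index
        total += (price * (100 - coupon["percent_off"]) // 100) if applies else price
    return total

# Customer A: big discount on one item, extras added at full price.
total_a = basket_total([10000, 2000, 1500], coupon_for("impulse"))
# Customer B: smaller discount across a couple of items.
total_b = basket_total([10000, 2000], coupon_for("bargain hunter"))
```

Customer A’s headline discount is three times the size of customer B’s, yet the full-price extras in his trolley mean the store still takes a similar amount from each basket.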
The downside to this method is the sheer amount of data required, plus maintaining a reliable delivery method (assuming email, this means running your own mailing list system rather than using a 3rd party like Aweber).
What methods do you use for profiling users? Do you think we can pigeonhole customers into 6-18 little groups?