<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tim Nash &#34;stuff&#34; Blog &#187; Advanced SEO</title>
	<atom:link href="http://www.timnash.co.uk/category/advanced-seo/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.timnash.co.uk</link>
	<description>The Stuff Consultant</description>
	<lastBuildDate>Tue, 07 May 2013 21:19:47 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Should Opinion Be a Ranking Factor?</title>
		<link>http://www.timnash.co.uk/05/2010/should-opinion-be-a-ranking-factor/</link>
		<comments>http://www.timnash.co.uk/05/2010/should-opinion-be-a-ranking-factor/#comments</comments>
		<pubDate>Sat, 15 May 2010 13:32:51 +0000</pubDate>
		<dc:creator>Tim Nash</dc:creator>
				<category><![CDATA[Advanced SEO]]></category>
		<category><![CDATA[Search Marketing]]></category>

		<guid isPermaLink="false">http://www.timnash.co.uk/?p=597</guid>
		<description><![CDATA[Tim shows how he added the ability to determine whether an article is fact or opinion and integrated it into a client&#8217;s internal search system. He then ponders a larger question: should opinions be weighted above factual information? Should opinion be a ranking factor?]]></description>
				<content:encoded><![CDATA[<p>I have just finished an interesting project for a client. They run a network of news and analysis sites and have a custom search engine. One of the complaints about the search was that news and factual information was being missed because of a wealth of opinion and editorials. I was brought in to try and come up with a way of identifying if an article was factual, opinion driven or editorial (factual that has an opinion), and to provide a ranking factor for the three. The idea was that a full-on opinionated rant should appear lower in the results than a factual news story.</p>
<p>Now, there is no such thing as an unopinionated article. Everyone will have a view, even if they try not to be biased, so the first and obvious question is: when should opinion count?</p>
<h3>What Is an Opinion?</h3>
<p>Well, a quick visit to dictionary.com brought up 2 out of 7 definitions for &#8216;opinion&#8217; worth considering:</p>
<ol>
<li>a belief or judgment that rests on grounds insufficient to produce complete certainty.</li>
<li>a personal view, attitude, or appraisal.</li>
</ol>
<p>The words I think are important are judgment, view and appraisal. These lead us to right, wrong, negative and positive, so we could perhaps say an opinion, in the context of written prose, is something that is skewed towards negative or positive wording.</p>
<blockquote><p>Tim is an utterly amazing and cool guy. I think everyone should follow him on <a href="http://twitter.com/tnash">twitter</a> because he is awesome.</p></blockquote>
<p>This can be safely assumed to be an opinion, as could:</p>
<blockquote><p>Tim is a miserable, boring, depressing person no one should follow!</p></blockquote>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/ujUQn0HhGEk&amp;hl=en_US&amp;fs=1&amp;rel=0" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="480" height="385" src="http://www.youtube.com/v/ujUQn0HhGEk&amp;hl=en_US&amp;fs=1&amp;rel=0" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<blockquote><p>But it’s not long before Storm gets started:<br />
“You can’t know anything,<br />
Knowledge is merely opinion”<br />
She opines, over her Cabernet Sauvignon<br />
Vis a vis<br />
Some unhippily<br />
Empirical comment by me</p>
<p>“Not a good start” I think&#8230;.<br />
I resist the urge to ask Storm<br />
Whether knowledge is so loose-weave<br />
Of a morning<br />
When deciding whether to leave<br />
Her apartment by the front door<br />
Or a window on the second floor.</p></blockquote>
<p><em>Tim Minchin &#8211; Storm </em></p>
<p>But what about:</p>
<blockquote><p>Placing the L45 into Warp drive mode will cause it to fail resulting in a negative feedback in the pinky.</p></blockquote>
<p>Here is something that, while negative words are used, could not be described as an opinion, assuming the warp drive does indeed fail in those circumstances. Still, if we assume an opinionated piece will have a higher concentration of negative or positive words per word count, we can create a simple rule for determining if text is opinionated or not.</p>
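<p>As a rough sketch of that simple rule (not the client code): count the loaded words and divide by the word count. The word lists and the 0.1 threshold are made up for illustration; a real lexicon would be far larger.</p>

```python
import re

# Hypothetical, tiny word lists purely for illustration.
POSITIVE = {"amazing", "cool", "awesome"}
NEGATIVE = {"miserable", "boring", "depressing"}


def opinion_density(text):
    """Return the fraction of words in text that appear in either word list."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    loaded = sum(1 for w in words if w in POSITIVE or w in NEGATIVE)
    return loaded / len(words)


def is_opinionated(text, threshold=0.1):
    """Flag text whose loaded-word density reaches the assumed threshold."""
    return opinion_density(text) >= threshold
```

<p>With these lists the &#8220;miserable, boring, depressing&#8221; quote scores 3 loaded words out of 11, while the warp drive sentence scores zero.</p>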
<h3>How To Determine Positivity or Negativity</h3>
<p>Let&#8217;s face it, certain words are positive and others are negative. As humans, we can work out which is which easily enough. Computers, however, need a little more help, so we have to provide them with a set of words and identify if they are positive or negative. Then, we simply process through the text and work out if each line is positive, negative, or neutral on a line by line basis. Simples.</p>
<p>Ok, so to do this, we could simply bucket count, but that&#8217;s not very efficient. Instead, we&#8217;ll use Bayes&#8217; Theorem:</p>
<p><em>P(A|B) = P(B|A)*P(A) / P(B)</em></p>
<p>Where P is probability, A and B are events, and P(A|B) is the probability that event A occurs given that B is true.<br />
So, Bayes&#8217; theorem calculates the probability that &#8220;A&#8221; is true or will be true, given a certain set of circumstances &#8220;B&#8221;. For more information, see <a href="http://www.statgun.com/research/bayes-law.html">Bayes Law in Plain English</a>.</p>
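<p>To make the formula concrete, here is a small worked example. The numbers are invented: A is &#8220;the sentence is negative&#8221; and B is &#8220;the sentence contains the word boring&#8221;.</p>

```python
def posterior(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# Made-up probabilities, for illustration only.
p_a = 0.3          # 30% of all sentences are negative
p_b_given_a = 0.2  # 'boring' appears in 20% of negative sentences
p_b = 0.08         # 'boring' appears in 8% of all sentences

p_a_given_b = posterior(p_b_given_a, p_a, p_b)
```

<p>With these figures the probability that a sentence containing &#8220;boring&#8221; is negative works out at roughly 0.75.</p>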
<p>With this math under our belt, we can construct a set of Bayesian classifiers. These will sound familiar because they are what a lot of anti-spam filters use. In our case, they allow us to process through and look for negative and positive words and phrases. With the classifier in place and a suitable set of positive and negative word lists, we can begin processing through the documents.</p>
<p>While the client was only interested in opinions, we actually stored the total number of sentences with negative and with positive opinions. The code for our initial version was heavily influenced by the <a href="http://phpir.com/bayesian-opinion-mining">Bayesian opinion mining</a> code on PHPIR. However, we did end up rewriting the code in C++ as a PHP module to speed up the results for our large data sets. In addition, we took a similar approach to <a href="http://darkoromanov.wordpress.com/2010/03/08/improved-bayesian-opinion-mining/">Darko Romanov</a> and filtered the stop words from our sentences.</p>
<p>Once we had processed a document, sentence by sentence, we determined an overall score. This score was then used, along with the total number of sentences, to determine how opinionated a piece was. We also showed if it had a positive or negative bias and compared it to 100 examples of what was deemed to be opinionated and 100 that were not.</p>
<p>The system correctly identified the one hundred opinionated pieces, but also incorrectly identified 12 of the non-opinionated pieces. Tweaking of the lexicon reduced this down to 4, which was deemed a reasonable error margin. However, it also let 1 opinionated piece through. Again, this was deemed acceptable. The goal now is to provide multiple lexicons dependent on the site and author who is writing the piece.</p>
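<p>The production code is not shown here, so the following is only a minimal sketch of the general technique: a naive Bayes sentence classifier with add-one smoothing and stop word filtering. The stop word list, class names and training sentences are all assumptions for illustration.</p>

```python
import math
import re
from collections import Counter

# Hypothetical stop word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "the", "is", "and", "he", "i", "to", "on", "of"}


def tokenize(sentence):
    """Lowercase, split into words and drop stop words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


class SentenceClassifier:
    """Naive Bayes over 'positive' and 'negative' classes, add-one smoothed."""

    def __init__(self):
        self.word_counts = {"positive": Counter(), "negative": Counter()}
        self.sentence_counts = Counter()

    def train(self, sentence, label):
        self.word_counts[label].update(tokenize(sentence))
        self.sentence_counts[label] += 1

    def classify(self, sentence):
        vocab = set(self.word_counts["positive"]) | set(self.word_counts["negative"])
        words = [w for w in tokenize(sentence) if w in vocab]
        if not words:
            return "neutral"  # no known words, so nothing to go on
        total = sum(self.sentence_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            # log P(label) plus the sum of log P(word | label), both smoothed
            score = math.log((self.sentence_counts[label] + 1) / (total + 2))
            denom = sum(counts.values()) + len(vocab)
            for w in words:
                score += math.log((counts[w] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)
```

<p>Working in log space avoids multiplying many tiny probabilities together, and the smoothing stops a single unseen word from zeroing out a class.</p>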
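<p>One way the per-sentence results might be rolled up into a document score is sketched below. The field names and the exact scoring are assumptions; the real system&#8217;s formula is not given here.</p>

```python
def document_score(sentence_labels):
    """Summarise per-sentence labels ('positive', 'negative' or 'neutral').

    Returns the share of sentences carrying an opinion and the overall bias.
    """
    total = len(sentence_labels)
    if total == 0:
        return {"opinion_ratio": 0.0, "bias": "neutral"}
    positive = sentence_labels.count("positive")
    negative = sentence_labels.count("negative")
    if positive > negative:
        bias = "positive"
    elif negative > positive:
        bias = "negative"
    else:
        bias = "neutral"
    return {"opinion_ratio": (positive + negative) / total, "bias": bias}
```

<p>A piece with one positive, two negative and one neutral sentence would come out with an opinion ratio of 0.75 and a negative bias.</p>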
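<p>Those figures can be turned into error rates with a few lines of arithmetic. In this sketch &#8220;positive&#8221; means &#8220;flagged as opinionated&#8221;, and the counts are taken from the test of 100 opinionated and 100 non-opinionated pieces described above.</p>

```python
def rates(true_pos, false_neg, false_pos, true_neg):
    """Accuracy and error rates, where 'positive' means flagged as opinionated."""
    total = true_pos + false_neg + false_pos + true_neg
    return {
        "accuracy": (true_pos + true_neg) / total,
        "false_positive_rate": false_pos / (false_pos + true_neg),
        "false_negative_rate": false_neg / (true_pos + false_neg),
    }


# Before tweaking the lexicon: all 100 opinionated caught, 12 factual pieces flagged.
before = rates(true_pos=100, false_neg=0, false_pos=12, true_neg=88)
# After tweaking: 4 factual pieces flagged, 1 opinionated piece let through.
after = rates(true_pos=99, false_neg=1, false_pos=4, true_neg=96)
```

<p>So the tweak trades a 1% miss rate on opinionated pieces for cutting the false flag rate from 12% to 4%, lifting overall accuracy from 94% to 97.5%.</p>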
<h3>Does It Cope With Sarcasm?</h3>
<p>Surprisingly well! Most sarcasm uses a positive indicator when a negative is in fact implied. Because the system breaks content down sentence by sentence, and most sarcasm is surrounded by other negative sentences, the sarcastic sentence itself will indeed be misclassified, but the surrounding sentences will not (hopefully), and so there would still be more negative than positive sentences within the piece.</p>
<h3>What Other Uses in Search?</h3>
<p>Well, obviously, in the original client request, they were looking at removing or reducing relevancy of opinionated pieces, but imagine if you did the opposite. Let&#8217;s say I run a review website that people can use to search for product reviews. Obviously, I link to products with affiliate links.</p>
<p>Now, imagine if my internal search was designed to show more positive results for higher commission or converting items? Negative reviews would be found further down the list of product reviews. Sneaky, but if you have lots of people using your internal search it could be one way to increase revenue.</p>
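<p>Either direction comes down to adjusting a relevance score by an opinion score. A minimal sketch, assuming each result already carries a relevance value and an opinion ratio between 0 and 1 (both field names, and the weight, are hypothetical):</p>

```python
def rerank(results, opinion_weight=0.5):
    """Sort results by relevance damped by how opinionated each one is.

    Each result is a dict with 'relevance' and 'opinion_ratio' in the range 0..1.
    A negative opinion_weight would boost opinionated pieces instead.
    """
    def adjusted(result):
        return result["relevance"] * (1 - opinion_weight * result["opinion_ratio"])

    return sorted(results, key=adjusted, reverse=True)
```

<p>With a weight of 0.5, a rant with relevance 0.9 and opinion ratio 0.8 drops to 0.54 and falls below a news story with relevance 0.8 and opinion ratio 0.1, which scores 0.76.</p>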
<h3>Should it be a ranking factor?</h3>
<p>Well, I suspect many owners of product sites may wish it was a factor. Take, for example, my recent post on <a href="http://www.timnash.co.uk/05/2010/gopark-co-uk/">GoParks</a>. The owner of the site would probably be quite keen if opinion pieces had reduced rankings. I can also see that for Google News or similar this style of ranking could be useful, but on the mainstream web I&#8217;m not convinced that it would work.</p>
<p>What do you think, should opinion be a ranking factor?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.timnash.co.uk/05/2010/should-opinion-be-a-ranking-factor/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Centralising your Analytics in a decentralised way</title>
		<link>http://www.timnash.co.uk/03/2010/centralizing-analytics/</link>
		<comments>http://www.timnash.co.uk/03/2010/centralizing-analytics/#comments</comments>
		<pubDate>Mon, 29 Mar 2010 11:14:37 +0000</pubDate>
		<dc:creator>Tim Nash</dc:creator>
				<category><![CDATA[Advanced SEO]]></category>
		<category><![CDATA[Behaviour modelling]]></category>
		<category><![CDATA[Case Studies]]></category>

		<guid isPermaLink="false">http://www.timnash.co.uk/?p=517</guid>
		<description><![CDATA[While getting ready for timnash.co.uk 3.0, Tim tries to tackle the problem of following a user throughout his various stats packages, and looks at various methods for coping with all this decentralised data.]]></description>
				<content:encoded><![CDATA[<p>I am working on timnash.co.uk v3.0, oh yeah!! As part of the new site I will be using it as a place to run more experimental Behaviour Modelling and analytical bits. More importantly, I want to make it easy for people to see what I&#8217;m gathering and what I&#8217;m doing with it. I haven&#8217;t worked out entirely how I&#8217;m going to do that yet, so stay tuned. However, one of the things I have been pondering is how I am going to combine my disparate stats gathering systems.</p>
<p>Currently I run:</p>
<ul>
<li>Google Analytics</li>
<li>Google Website Optimizer and other A/B testing</li>
<li>GetClicky</li>
<li>Heatmap software</li>
<li>Occasional CSS History profiler</li>
<li>Surveys</li>
</ul>
<p>If I want to track a user across all of these, currently I can&#8217;t. For example, if I want to see the clickmap of a user, I can&#8217;t compare it with the Google Analytics data. This is fine, but the more work I do, the more I want to be able to follow a user through the entire experience. At this stage many people may start to think &#8220;ok, reduce the number of third party services&#8221;, and this thought has occurred to me. The reason I use GetClicky and Google Analytics is that I can&#8217;t do better; it&#8217;s that simple.</p>
<h3>Privacy Concerns</h3>
<p>The biggest issue when linking multiple systems together is the inevitable extra privacy issues. While these systems are separate they are pseudo-anonymous; combining them makes it much easier to identify a user, especially when linked with a login/commenting system where they have to give their email and other information like their name. However, in many ways I think centralising your data makes concerns easier to deal with. For example, you can set up a single &#8220;remove me from your tracking&#8221; service (also you can track how many people have opted out! oh wait, is that wrong?). So centralising my data will not only make things easier for me, it will also make things easier for visitors who have privacy concerns.</p>
<h3>Central storage area</h3>
<p>The obvious way to centralise all the data is to create a central storage repository and put data in it. Of course, this immediately presents several obvious problems.</p>
<ul>
<li>Replication</li>
<li>Single Point of Failure</li>
<li>Reliability</li>
</ul>
<p><em>Replication</em> &#8211; There is rarely a good reason in life to have two <em>working copies</em> of something, your analytics data included. Apart from the fact you have to maintain both copies, you also have to check data integrity, and the second copy takes up space and therefore costs more to store.</p>
<p><em>Single Point of Failure</em> &#8211; While not normally a problem, when something is being continually used for both reads and writes its life expectancy is limited. This is made worse by the fact that several parts of the site will be reliant on the system to make choices; if the system falls over, or worse is just slow, it will cause issues throughout the site.</p>
<p><em>Reliability</em> &#8211; One of the reasons to use third party services is so I don&#8217;t have to handle such things as uptime and reliability. Any benefit in getting someone else to do the work is lost if I then redo it.</p>
<p>The advantage is speed: as long as it&#8217;s up, we should be able to access everything instantly.</p>
<h3>Decentralised with key link</h3>
<p>The second approach to look at is linking all the various services with a common key. Most third party services worth anything will allow you to store a custom value against a visitor. If the same custom value is used per visitor for all the services, then they can be tracked through various calls to each service&#8217;s API. This is easier said than done&#8230;</p>
<p>A couple of problems that immediately come to mind:</p>
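<p>Assuming each service can export its records with the custom value attached, joining them up is a simple grouping job. In this sketch the <code>visitor_key</code> field name and the shape of the exports are made up; each real service&#8217;s API will differ.</p>

```python
from collections import defaultdict


def merge_by_visitor_key(*service_exports):
    """Group records from several services under the shared custom visitor key.

    Each export is a (service_name, records) pair; each record is a dict that
    carries the custom value in a 'visitor_key' field (a hypothetical name).
    """
    merged = defaultdict(dict)
    for service_name, records in service_exports:
        for record in records:
            per_service = merged[record["visitor_key"]]
            per_service.setdefault(service_name, []).append(record)
    return dict(merged)
```

<p>The result maps each key to everything each service knows about that visitor, which is the cross-service view that is missing when the packages are used separately.</p>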
<ul>
<li>Identifying the unique visitor</li>
<li>Linking a visitor after the fact</li>
<li>What controls the initial identification </li>
</ul>
<p>It also shares the potential single point of failure of the totally centralised solution: if the service that tags visitors is down, the data is lost. This, however, seems a much smaller risk. At worst some visitors are not tagged correctly, and it probably means the site has far worse problems!</p>
<p><em>Identifying the unique visitor</em> &#8211; This at first glance seems easy, but doing it accurately is actually more difficult and is a post in its own right. Once identified, the next problem is choosing a naming strategy for a visitor ID. If we had a centralised relational database this would be easy, as it would be the ID of the row, but we don&#8217;t. Some ideas I played with were timestamp, IP and profile type, or some combination of these.</p>
<p>Once the ID of the unique user is set and stored on their machine, either through a session, a cookie or some more hardy persistent storage, they can simply be picked up in the future.</p>
<p><em>Linking to a user after the fact</em> &#8211; There can be times where a user may be identified after a service has stored data about the individual. Some systems will automatically tie in the old data with the new; others won&#8217;t, and unfortunately there is not much you can do barring a recursive check and additions. For example, let&#8217;s assume a user visits a site on a laptop from home, then visits at work. We treat his work visit as a different instance; when he logs in, we can identify this new visitor as the same user. However, we have already sent a pile of custom keys to all our analytical packages.</p>
<p><em>What controls the initial identification</em> &#8211; This is a trickier issue. In the scenario of my blog, it is a simple WordPress plugin that checks to see if a persistent storage entry or cookie is on the user&#8217;s machine, determines the ID and adds a cookie as needed.</p>
<p>So we have two competing systems, both with problems. The solution seems to be a blend of the two.</p>
<h3>Decentralised in a Centralised way</h3>
<p>I&#8217;m going to run through two examples of the way I&#8217;m going to centralise my data, one for here on timnash.co.uk and the other for a membership site.</p>
<p>On timnash.co.uk I&#8217;m going for a totally decentralised approach. A WordPress plugin will identify users based on whether they have been tagged before. As I have no easy way to identify whether they are a previous user on a different browser or machine, except if they comment, there is no major advantage in maintaining any form of database control. Users will be tagged with a combination of timestamp + profile ID + random number.<br />
This is then included as custom data in all the stats gathering packages and stored on the user&#8217;s machine using the browser&#8217;s persistent storage. If a user wishes me not to collate individual data, they can opt out via the privacy page. This will place a persistent storage cookie telling the system not to attach the key to their pages; to opt out entirely, they will still need to individually opt out of each service.</p>
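<p>The real thing would live in a WordPress plugin; the Python below is only a sketch of the tagging logic under some assumptions: a hyphen separated key format, a six digit random part, and function names that are entirely made up.</p>

```python
import secrets
import time


def make_visitor_key(profile_id, now=None):
    """Build a key from timestamp + profile id + random number."""
    timestamp = int(time.time() if now is None else now)
    random_part = secrets.randbelow(10**6)  # assumed range, zero padded below
    return f"{timestamp}-{profile_id}-{random_part:06d}"


def key_for_request(stored_key, opted_out, profile_id):
    """Reuse a stored key, mint a new one, or return None for opted-out visitors."""
    if opted_out:
        return None  # nothing gets attached to the stats packages
    return stored_key or make_visitor_key(profile_id)
```

<p>The random part is there so that two visitors with the same profile arriving in the same second still get different keys.</p>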
<p>For a membership site I run I plan a similar system. However, as it has a login system, individual browser profiles (unique keys) will be stored against a logged in user. This will allow these profiles to be linked via the username, and has the advantage of spotting password sharers if there are a large quantity of browser combinations (it should be able to detect this even if users use proxies or are on a corporate network).</p>
<p>So that&#8217;s the plan. Anyone see any major issues with it? Let me know, ideally before I fully build it! How are you managing your various data services?</p>
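<p>A minimal sketch of that idea, keeping a set of browser profile keys per username and flagging accounts with suspiciously many. The class name and the threshold of 5 are assumptions, not part of the planned system.</p>

```python
from collections import defaultdict


class ProfileLinker:
    """Track which browser-profile keys have been seen for each logged-in user."""

    def __init__(self, max_profiles=5):
        self.max_profiles = max_profiles  # assumed threshold, tune per site
        self.profiles = defaultdict(set)

    def record_login(self, username, visitor_key):
        """Link a browser profile key to the account that just logged in."""
        self.profiles[username].add(visitor_key)

    def keys_for(self, username):
        """All profile keys linked to a username, for cross-service lookups."""
        return sorted(self.profiles.get(username, set()))

    def possible_sharers(self):
        """Usernames seen on more distinct browser profiles than the threshold."""
        return sorted(
            user for user, keys in self.profiles.items()
            if len(keys) > self.max_profiles
        )
```

<p>Because the keys are per browser profile rather than per IP address, a shared account still shows up as many distinct profiles even when everyone is behind the same proxy.</p>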
<div id="vs-message">
<strong>Consulting</strong><br /> <br />
Looking to develop a similar system or interested in doing detailed tracking and profiling of users? Why not come and have a chat and see what I can do for you! For more details please <a href="http://www.timnash.co.uk/contact/" >contact me</a> or look on my <a href="http://www.timnash.co.uk/consulting/" >consulting services</a>.</div>
]]></content:encoded>
			<wfw:commentRss>http://www.timnash.co.uk/03/2010/centralizing-analytics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google obeying external REP requests?</title>
		<link>http://www.timnash.co.uk/09/2009/google-obeying-external-rep-requests/</link>
		<comments>http://www.timnash.co.uk/09/2009/google-obeying-external-rep-requests/#comments</comments>
		<pubDate>Fri, 25 Sep 2009 11:59:47 +0000</pubDate>
		<dc:creator>Tim Nash</dc:creator>
				<category><![CDATA[Advanced SEO]]></category>
		<category><![CDATA[Case Studies]]></category>
		<category><![CDATA[SEO Introduction]]></category>

		<guid isPermaLink="false">http://www.timnash.co.uk/?p=364</guid>
		<description><![CDATA[Does the Google crawler actually check the HTTP status codes of the robots.txt and obey them, or does it just behave strangely?]]></description>
				<content:encoded><![CDATA[<p>Yesterday one of the <a href="http://www.davidnaylor.co.uk/dangers-of-custom-shortened-urls.html">Bronco team</a> wrote an interesting post on the fact that Google&#8217;s crawler was possibly following a 301 redirect of a robots.txt file even when the target was on a separate domain!</p>
<p>At the time I first double checked that our own bots don&#8217;t do anything quite so stupid, before suggesting that I thought it unlikely but would happily test it. Dave suggested a wager, oddly enough one I never took him up on, and I&#8217;m glad I didn&#8217;t!</p>
<h3>How we crawl robots.txt file</h3>
<p>The need for speed is paramount when crawling a site. A bot is taking up server resources and you want it to complete its required action in as short a time as possible. If your bot follows REP (what&#8217;s REP? <a href="#post_notes">See Post notes for details</a>), its first action should be to download the available robots.txt file. On average these files are 2-4kb in size, which is very small and takes no time to download. However, a file sent with a 404 is closer to 22-40kb, assuming it also sends the associated HTML, which is a much larger size. Given that the majority of sites do not have a robots.txt file, this means that if you are not careful your robot will spend more time downloading a useless file than anything else. The method we use is to simply ask initially for just the first packets: if the return is a status 200 we proceed to download the file; anything else and the status is stored and the file is ignored.</p>
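<p>Our crawler code is not shown here, but the idea can be sketched with Python&#8217;s standard <code>http.client</code>, which parses the status line and headers before any body is read. The function names are made up for illustration.</p>

```python
import http.client


def handle_robots_response(status, read_body):
    """Download the body only on a 200; otherwise just record the status."""
    if status == 200:
        return {"status": status, "body": read_body()}
    return {"status": status, "body": None}


def fetch_robots_txt(host, timeout=10):
    """Request /robots.txt and look at the status before reading any body."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("GET", "/robots.txt")
        response = conn.getresponse()  # parses the status line and headers only
        return handle_robots_response(
            response.status,
            lambda: response.read().decode("utf-8", "replace"),
        )
    finally:
        conn.close()  # closing without read() abandons the rest of an error page
```

<p>Keeping the decision in its own small function makes it easy to change which statuses are followed later without touching the network code.</p>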
<h3>Is the way you do your crawler the correct way Tim?</h3>
<p>There is no &#8220;official&#8221; recommendation within the documents governing REP that covers how you should treat status codes and which you should follow. Only following a status 200 is by far the most efficient method, but it comes at a cost, as you could be ignoring the file! It also doesn&#8217;t totally protect against downloading 404 pages, as some servers send out a status 200, not a 404, when a page cannot be found.<br />
A draft proposal did suggest that other status codes should be followed, including the 3xx codes for temporarily or permanently moved documents, but it did not explicitly mention dealing with cross domains.<br />
I have started to make changes to our own bots (<a href="#post_notes">see Post notes for details</a>).</p>
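<p>For comparison, here is one reading of how a bot following the draft proposal might map status codes to actions. The mapping is an interpretation and the action names are invented, so treat it as an assumption rather than the draft&#8217;s exact wording; it also says nothing about whether a redirect may cross domains.</p>

```python
def robots_policy(status):
    """Map an HTTP status for /robots.txt to a crawler action (assumed mapping)."""
    if 200 <= status < 300:
        return "parse"            # read the file and obey it
    if 300 <= status < 400:
        return "follow-redirect"  # chase the redirect to find the file
    if status in (401, 403):
        return "disallow-all"     # access restricted: treat the site as off limits
    if status == 404:
        return "allow-all"        # no file, so no instructions
    return "retry-later"          # anything else is treated as a temporary failure
```

<p>Under this mapping a 503 or a made-up code such as 666 would defer the crawl rather than be read as a valid file.</p>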
<h3>How does Google deal with cross domain 301 of a robots.txt file?</h3>
<p>It reads the file, at least according to Webmaster Tools. In Bronco&#8217;s <a href="http://www.davidnaylor.co.uk/dont-make-the-same-mistakes-as-bit-ly-and-tr-im.html">follow up post</a> they show Google Webmaster Tools accepting the bit.ly/robots.txt file. Nice. Should we be alarmed? Potentially, though only if you&#8217;re allowing your users custom URLs with dots in them at the root level of your domain. So if you&#8217;re running a URL shortener then yes, it is perhaps something to check.</p>
<h3>Did you do your own tests?</h3>
<p>Yes, I had already done some tests last night which backed up what they did. Here is how I tested.</p>
<p><strong>Experiment 1 &#8211; Cross 301, oh please say this doesn&#8217;t work!</strong><br />
We created two domains, Domain A and Domain B, with the robots.txt on Domain A 301 redirecting to a file on Domain B. That robots.txt disallowed access to the /test/ folder. A test folder containing an index file was put on both domains, and each domain was given a root index linking to the other domain and to each of the test files.</p>
<p>If Google followed the redirect and obeyed the robots.txt, then Domain A should have 1 indexed page and Domain B 2 when finished.<br />
With a monitor attached to the logs, doing reverse DNS lookups for Google IPs so we could watch the interaction, some links were thrown at Domain A.</p>
<p>Result: <strong>Domain A</strong> &#8211; <em>1 page indexed</em>,<strong> Domain B</strong> &#8211; <em>2 pages indexed</em></p>
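<p>The exact file used in the test is not reproduced here, but a robots.txt of this kind, and what it should block, can be checked with Python&#8217;s standard <code>urllib.robotparser</code>. The file contents and the <code>domain-a.example</code> hostname are stand-ins.</p>

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of the test robots.txt: block the /test/ folder for everyone.
ROBOTS_TXT = """\
User-agent: *
Disallow: /test/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, user_agent="Googlebot"):
    """True if the parsed rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


root_ok = allowed("http://domain-a.example/")
test_ok = allowed("http://domain-a.example/test/index.html")
```

<p>A crawler obeying this file should index the root page but not the page in the test folder, which is why one indexed page on Domain A signals that the redirected file was obeyed.</p>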
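<p>Our log monitor is not shown here, but the usual way to check that a visiting IP really belongs to Google is a reverse lookup followed by a confirming forward lookup. This sketch uses the standard <code>socket</code> module; the accepted hostname suffixes are an assumption, and the resolver functions are injectable so the logic can be tried without touching DNS.</p>

```python
import socket

# Assumed hostname suffixes for Google's crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_ip(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the hostname, then confirm it resolves back."""
    try:
        hostname = reverse(ip)[0]
        if not hostname.endswith(GOOGLE_SUFFIXES):
            return False
        # Forward-confirm: the hostname must list the original IP
        return ip in forward(hostname)[2]
    except OSError:
        return False  # no PTR record or lookup failure: treat as not Google
```

<p>The forward confirmation matters because anyone can set the reverse DNS of their own IP to a name that merely looks like Google&#8217;s.</p>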
<p>In Webmaster Tools a <strong>status 200</strong></p>
<p><strong>Experiment 2 &#8211; Let&#8217;s give Google the benefit of the doubt</strong><br />
Ok, so maybe they have indeed adopted the 1997 draft and are therefore obeying redirects. It will ignore a status 666 though, right?<br />
Fresh domain; this time our robots.txt file will be in the correct location but will send an HTTP status of 666.</p>
<p>Result: <strong>Domain A</strong> &#8211;  <em>indexed 1 page</em></p>
<p>In Webmaster Tools &#8211; <strong>status 200</strong></p>
<p><strong>Experiment 3 &#8211; Giving you a robots.txt file regardless</strong><br />
Ok, so what if we tell you our server is broken, i.e. a 5xx, but we give you a correct robots.txt file?<br />
Fresh domain, correct location, but the headers sent are HTTP 503 &#8211; Service Unavailable. We are telling it we are not available; the server is buggered, in effect.</p>
<p>Result: <strong>Domain A</strong> &#8211;  <em>indexed 1 page</em></p>
<p>In Webmaster Tools &#8211; <strong>status 200</strong></p>
<blockquote><p><em>Tim</em> &#8211; If you think about this, it actually supports the belief that Google have programmed in the ability to follow 3xx, as otherwise it would have returned a 404, or a 200 and a blank file, for the 3xx.</p></blockquote>
<p><strong>Experiment 4 &#8211; I&#8217;m not here even though I&#8217;m here</strong><br />
Final test: send HTTP status 404 but also a valid robots.txt file. What are you going to do, Google?</p>
<p>Result: <strong>Domain A</strong> &#8211;  <em>indexed 2 pages</em></p>
<p>In Webmaster Tools &#8211; <strong>status 404</strong></p>
<p>Only in the final test did Google behave as if it was paying the blindest bit of notice to HTTP status codes. Can we assume 404 is hard coded and it will accept anything else?</p>
<h3>Why should you care?</h3>
<p>While the potential for abuse is small unless you run something akin to a URL shortener, what happens when your site is producing an intermittent 500 error? From playing with status codes, it would seem that Google, if shown an invalid or unreachable robots.txt, will continue to use the old file. Could this be a potential for abuse? What about a sneaky Google-only 301 redirect from your robots.txt, set up by a very mischievous hacker? Food for thought, and I&#8217;m glad I didn&#8217;t take that bet with DaveN.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.timnash.co.uk/09/2009/google-obeying-external-rep-requests/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Behaviour Modelling Seminar</title>
		<link>http://www.timnash.co.uk/09/2009/behaviour-modelling-seminar/</link>
		<comments>http://www.timnash.co.uk/09/2009/behaviour-modelling-seminar/#comments</comments>
		<pubDate>Tue, 01 Sep 2009 09:25:32 +0000</pubDate>
		<dc:creator>Tim Nash</dc:creator>
				<category><![CDATA[Advanced SEO]]></category>
		<category><![CDATA[Case Studies]]></category>
		<category><![CDATA[Search Marketing]]></category>
		<category><![CDATA[SEO Introduction]]></category>
		<category><![CDATA[Site Information]]></category>

		<guid isPermaLink="false">http://www.timnash.co.uk/?p=348</guid>
		<description><![CDATA[Tim is running a seminar before Think Visibility, the first public one he has done in a couple of years. Do you want to be at it?]]></description>
				<content:encoded><![CDATA[<p><strong>September 11th, 1pm onwards, Leeds UK <a href="http://seminar.timnash.co.uk/">signup here</a></strong></p>
<p>What, you want more? Seriously, I mean, what do I need to tell you?<br />
I will be running a seminar on <a href="http://seminar.timnash.co.uk/">Profiling website users</a>, looking at life cycles and a huge pile of tricks and ideas to get the most out of your users. Want to know if colour affects genders differently? Want to know how to estimate gender in website visitors?</p>
<p>Then you should be attending this seminar! What&#8217;s more, it&#8217;s free. Well, sort of: it&#8217;s actually free to <a href="http://www.thinkvisibility.com">Think Visibility</a> attendees; for everyone else it&#8217;s £50. But of course what this really means is that if you buy a <a href="http://thinkvisibility.buildevents.com/">Think Visibility</a> ticket at £99 and use my discount coupon TIMNASH, the price will drop down to such a level that you&#8217;ve got to ask yourself why not come to both <img src='http://www.timnash.co.uk/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>I&#8217;m afraid while this seminar will be filmed it will be only for internal use and not available for purchase, if it&#8217;s a success I might run a future seminar and film it.</p>
<p>If you are interested in improving your conversions, or even just curious as to what it is that I find more interesting than SEO (or, if you&#8217;re into SEO, why what we do is so much more than the normal), then come along for the afternoon. It&#8217;s free; simply register to confirm your place at the <a href="http://seminar.timnash.co.uk/">behaviour modelling seminar</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.timnash.co.uk/09/2009/behaviour-modelling-seminar/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>So long and thanks for all the fish!</title>
		<link>http://www.timnash.co.uk/06/2009/so-long-and-thanks-for-all-the-fish/</link>
		<comments>http://www.timnash.co.uk/06/2009/so-long-and-thanks-for-all-the-fish/#comments</comments>
		<pubDate>Mon, 29 Jun 2009 15:22:44 +0000</pubDate>
		<dc:creator>Tim Nash</dc:creator>
				<category><![CDATA[Advanced SEO]]></category>
		<category><![CDATA[SEO Introduction]]></category>
		<category><![CDATA[Site Information]]></category>

		<guid isPermaLink="false">http://www.timnash.co.uk/?p=321</guid>
		<description><![CDATA[Tim says goodbye to the SEO industry and steers the blog on a new course and hopefully onto further adventures. Are you coming too?]]></description>
				<content:encoded><![CDATA[<p>I have been living a lie, or rather this blog has. You see, people assume I work in the SEO industry, and to be fair the blog is to blame. For starters, the word SEO appears a fair few times, quite a lot of the posts are on SEO, and I even appear to offer SEO courses. The reality is that very little of the work I or Venture Skills do is SEO related these days. We have a few straggling clients on long term contracts, and we work in a supervisory role for many clients, auditing other SEO companies, normally without their knowledge, on behalf of concerned clients. This work will continue, as will providing our link mapping and reputation management tools to clients, but both this blog and the company are going to evolve with new branding and a public persona to introduce what we really are&#8230;</p>
<h3>Why is timnash.co.uk changing now? </h3>
<p>Well, to be honest, I have gotten bored of people pigeonholing me as an &#8220;SEO&#8221;. It&#8217;s not a title I have ever been comfortable with, and as far as I&#8217;m concerned there are fewer than a handful of true SEOs in the world; the rest are failed web designers who need to convince people that, rather than do a real job, they should be paid to fix bad code mistakes and write crap to gain links. Don&#8217;t get me wrong, many of these people are my friends, and some are incredibly talented writers, programmers and developers, but virtually none of them are &#8220;Search Engine optimisers&#8221;. As an industry there is little or no understanding of what a search engine is, the mathematics or logistics behind large scale information retrieval, or even the process of crawling, retrieving and indexing. This is not a problem for me, and I think it&#8217;s on the whole a good thing; it certainly has been for our clients. While the industry sits in the mud throwing its own muck at each other, a silent group have just been getting on with the job, but I just don&#8217;t want to be associated with the mud slinging. Some say change should come from the inside, but let&#8217;s face it, a quick look at SEO cesspools such as Sphinn will quickly show that it&#8217;s not going to change soon. Plus, as my personal interests shift away from SEO to different aspects of information retrieval and data analysis, I find it harder to be interested in the day to day politics and whinging of the industry. I have already stopped attending or speaking at SEO conferences, with the exception of smaller ones such as <a href="http://www.thinkvisibility.com/">Think Visibility</a>, and then I pick and choose based on what other things the conference is bringing.</p>
<h3>Venture skills is a data analysis company</h3>
<p>We really are and always have been! We act as a troubleshooting think tank: we collect, measure and report on data sets, making recommendations on how to proceed, be that helping clients optimise their sites for search engines, improving conversion rates or predicting the spread of flu within a region. We work both online and off, with clients as diverse as the government, high street stores and online businesses. Venture Skills will no doubt continue to brand itself as an Information Architecture firm, as we believe ultimately that is what we do, but we shall put a renewed emphasis on the IR and data analysis side of what we do. In part to help this, we plan on introducing a new website, though we will not be including a blog; the current Venture Skills blog will remain live but will not be updated after this week. The new site will introduce what we do and why, and hopefully will answer the age old question: what exactly is &#8220;stuff&#8221;? I&#8217;m sure we as a company will put some sort of press release like thing out, as, unlike at certain organisations, this is my personal blog and not a mouthpiece for Venture Skills, so until it&#8217;s announced officially this is just hearsay.</p>
<h3>What is to become of timnash.co.uk?</h3>
<p>It will evolve. The SEO course and Q&#038;A section will be taken down, the front page will no doubt have newer text, and I am slowly introducing new ideas (such as the additional post notes in the sidebar; you can see them being used in this article), but to be honest the content probably won&#8217;t change too much. Recently I have been discussing profiling, metrics and analytics more than SEO, and this will continue, with my goal being to make these more accessible and hopefully open people&#8217;s eyes to this fascinating aspect of data analysis. I will also be introducing some more programming and examples (even if I have to fake the data to protect clients&#8217; privacy). What this blog will not become is a &#8220;how to make money online&#8221; blog. Tried that, it&#8217;s boring! Occasional articles on how I have improved conversions on one of my own sites, maybe, but I&#8217;m not going down the internet marketing guru route.</p>
<h4>Will you be offering SEO consulting services?</h4>
<p>Nope. timnash.co.uk will not offer dedicated SEO services, and all the courses and Q&#038;A pages will be removed, though I will be introducing a &#8220;have Tim in the office for a day&#8221; option for companies that need a total boost and change of perspective. The idea comes from the original concept for Venture Skills: introducing skills, rather than money, into a business. While the company never went down that route, it&#8217;s something I have always thought was worth exploring. I still have to think about how best to implement the idea, but more to come soon.</p>
<h4>Will the new timnash.co.uk feature the old picture? What about old features? Oh, and the mannequins?</h4>
<p>No to the photo. Since I have gone through <a href="http://www.timnash.co.uk/04/2009/age-regression-therapy/">age regression</a> I feel no shame in updating the photo with the younger, beardless, hairless person I have, um, become <img src='http://www.timnash.co.uk/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  In addition I have a few new ideas, including a calendar that I promise to keep up to date with which events I will be at and what I&#8217;m doing, maybe even before I go to them! Finally, while I will be updating the look, the mannequins will be staying and may even become more prominent. If anyone has any 3D skills and wants to help me create some more, then get in touch!</p>
<h2>Tim Nash, you said SEOs are crap. I hate you!</h2>
<p>cool, bye!<br />
No doubt I will lose a couple of subscribers, but let&#8217;s face it, I have never been in the cool club; I was always the person you came to when everyone else had said they had no idea and you had best go and ask the geeky ones in the corner. So if you feel that where I&#8217;m taking the blog (and hopefully you) is not somewhere you want to go, then farewell. I hope you found the previous posts useful and will still pop by occasionally. I doubt the move will bring in new subscribers, going from a niche to a tiny niche, but hey, if I wanted subscribers I would write top 10 lists! </p>
<p>I don&#8217;t want this post to be overly negative or personal. Within the &#8220;SEO industry&#8221; I have made some great friends, and while my opinion of the industry is harsh, that doesn&#8217;t mean I think any less of those people, nor does it mean I have had a bad time. I have enjoyed debating, testing and sometimes arguing with people, and for that reason I want to say:</p>
<p><em>So long SEOs and thanks for all the fish!</em></p>
<p>For those still wanting to carry on this journey, you can always subscribe to the RSS feed, and you can find me on <a href="http://www.facebook.com/timnash">Facebook</a> | <a href="http://www.linkedin.com/in/tnash">LinkedIn</a> | <a href="http://www.twitter.com/tnash">Twitter</a><br />
Please note: if I don&#8217;t know you I&#8217;m not likely to accept a friend invite on Facebook, and if I do and you&#8217;re attached as a limited profile, don&#8217;t get too upset; so are most people.</p>
<p>This article is the opinion of Tim Nash, not Venture Skills; see footer for details <img src='http://www.timnash.co.uk/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.timnash.co.uk/06/2009/so-long-and-thanks-for-all-the-fish/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Active vs Passive Profiling</title>
		<link>http://www.timnash.co.uk/06/2009/active-vs-passive-profiling/</link>
		<comments>http://www.timnash.co.uk/06/2009/active-vs-passive-profiling/#comments</comments>
		<pubDate>Tue, 23 Jun 2009 13:02:34 +0000</pubDate>
		<dc:creator>Tim Nash</dc:creator>
				<category><![CDATA[Advanced SEO]]></category>
		<category><![CDATA[Case Studies]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.timnash.co.uk/?p=309</guid>
		<description><![CDATA[Looking at the difference between passive and active profiling, and how overusing active profiling can push your bounce rate up by more than it increases conversions among those who stay on the site.]]></description>
				<content:encoded><![CDATA[<p>When attempting to profile and group visitors, you want as much information about them as possible to make profiling easier, but sometimes getting this information comes at the cost of interrupting the user&#8217;s flow. Are such techniques ultimately worth it?</p>
<h3>Passive Profiling</h3>
<p>From the moment a visitor enters a site they provide a wealth of information, hopefully including where they have come from, their browser, their physical location (of their ISP at least) and their language. All of this can be, and is, faked by a small minority, but for the vast majority this information can be treated as reasonably accurate. Even with this limited information we can make educated guesses about a user; for example, we can guess whether the user is at work or not. </p>
<p>Once the visitor starts to interact with the site we gain even more information. For example, once a suitable number of clicks have been recorded we can start to use behavioral models to estimate gender and age (<a href="#post_notes">see post notes for details</a>). Return visitors provide even more information: do they return directly or through the original referrer? All of this builds up a picture which helps to determine how to present the site to the visitor and turn them from a visitor into a “user”, but this sort of profiling requires the visitor to stay on the site long enough to be profiled. 10 clicks may not sound like much, but that could easily be a completed transaction or a lost sale before passive profiling has had a chance to determine the best route for the visitor (<a href="#post_notes">see post notes for details</a>).<br />
<a href="#" id="active" name="active"></a></p>
<h3>Active Profiling</h3>
<p>Active profiling, in this context, means pushing the visitor into a decision-making process rather than allowing them to wander around the site on their own. This sort of profiling is not without risk: people baulk at being told what to do, especially when a site does not operate within their preconceived boundaries of how a site should behave. For many people, a website asking questions just isn&#8217;t cricket!</p>
<p>The key to active profiling is to force the user into making a decision which can then be profiled. This is usually done by removing navigation or hiding content behind some sort of form. The user is given no real choice (other than to bail) but to tell us something, which we can then use to show them the next step in our profiling. Let&#8217;s take a really simple example. If I am selling software I may want to distinguish larger corporate customers from small businesses, so I present visitors with a screen with two options, one for small businesses and one for corporate customers. Each leads to a different landing page selling the same software but highlighting different features. It is simple and pretty uninvasive, but we are adding a step between the visitor and the information they came for, and this will increase the number of exits at that point. If those who stay convert at a higher rate, though, it&#8217;s worth losing a few visitors to gain the information needed to improve conversions.</p>
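<p>To put rough numbers on that trade-off, here is a minimal Python sketch. The visitor count, exit rate and conversion rates are invented purely for illustration, and <code>sales</code> and <code>break_even_conversion</code> are hypothetical helper names, not part of any real tool.</p>

```python
def sales(visitors, exit_rate, conversion_rate):
    """Expected sales when a share of visitors leaves at the profiling step."""
    remaining = visitors * (1 - exit_rate)
    return remaining * conversion_rate


def break_even_conversion(baseline_rate, exit_rate):
    """Conversion rate the profiled pages need just to match the baseline."""
    return baseline_rate / (1 - exit_rate)


# Assumed figures: 10,000 visitors, 2% baseline conversion, and a profiling
# step that loses 15% of visitors but lifts conversion to 3% for the rest.
baseline = sales(10_000, exit_rate=0.0, conversion_rate=0.02)
profiled = sales(10_000, exit_rate=0.15, conversion_rate=0.03)
print(baseline, profiled)
print(break_even_conversion(0.02, 0.15))
```

<p>The break-even figure is the useful one: if the extra step costs you 15% of visitors, the profiled pages only pay for themselves once they convert better than the baseline rate divided by 0.85.</p>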
<h4>Pros of Active Profiling</h4>
<ul>
<li>Gather information that you just couldn&#8217;t gather through other means</li>
<li>Confirm information gathered through passive profiling</li>
<li>Can be introduced directly into the visitor&#8217;s funnelling process</li>
<li>Much quicker and more reliable data than passive profiling</li>
</ul>
<h4>Cons of Active Profiling</h4>
<ul>
<li>Higher bounce and exit rates</li>
<li>“Just click” mentality, particularly if overused, so that a visitor clicks randomly rather than providing real answers</li>
<li>Can cause confusion to visitors</li>
</ul>
<p>The best method is of course to combine the two methods to gather as much information as possible while keeping the active profiling limited and filling only vital gaps. </p>
<h3>Landing Pages &#8211; Case Study</h3>
<p>Let&#8217;s take an example of a site selling laptops to households; its target demographic is 18-30 year olds in the UK. </p>
<p>On arrival, the visitor notices that the top of the site has a loading icon. In reality the site is well optimised; the delay is the site generating the new arrival&#8217;s profile to put them initially into one of 6 groups: Male/Female/Neutral crossed with UK/Not UK. To do this the page looks up the IP address to confirm the country (if it cannot be identified, assume not UK) and uses CSS history to estimate gender (if this fails, pesky Firefox user, then gender neutral). </p>
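<p>The fallback rules for those six starting groups can be sketched in a few lines of Python. This is only an illustration of the logic: <code>initial_group</code> is a hypothetical function, and the inputs stand in for whatever the IP lookup and history check actually return.</p>

```python
def initial_group(country_code, gender_guess):
    """Place a new arrival into one of the six starting groups.

    country_code: ISO country code from an IP lookup, or None if it failed.
    gender_guess: "male", "female", or None if the history check failed.
    """
    # An unidentified country is treated as Not UK.
    region = "UK" if country_code == "GB" else "Not UK"
    # A failed or unrecognised gender guess falls back to Neutral.
    if gender_guess in ("male", "female"):
        gender = gender_guess.capitalize()
    else:
        gender = "Neutral"
    return (gender, region)


print(initial_group("GB", "male"))
print(initial_group(None, None))
```

<p>Both failure cases default to the least specific group, so a bad guess costs you targeting rather than showing the visitor something aimed at the wrong person.</p>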
<p>Once the initial profile is generated the assets are loaded. A “British male” is presented with a female character, “Kirsty”; a “British female” gets “Craig”; and the remaining UK visitors get “Barry”. Each has certain charms designed to entice and, more importantly, distract the visitor. Similar but more generic characters appear for the other 3 groups. These characters then ask a question with four choices (this is our active profiling). Once loaded, the characters have hover tracking on them to help determine how distracting they are being, so that on future pages they can be brought to the fore or pushed into the background. The only two pages where the visitor is not determining the character&#8217;s position (though of course the visitor has no idea they are, or it would ruin the effect) are this initial page and the final checkout page, where the characters are always at the front, almost egging the user on. </p>
<p>Once the user has “answered” the character&#8217;s question they are further subdivided into groups. From this point the wording, colours and location of buttons are determined by the profile they are in. In effect it determines whether “Kirsty” talks tech to you or whether blue is your colour!</p>
<p>This sort of segmentation is highly effective if you already have a tight demographic you are targeting. </p>
<h3>Preventing Comment Spam and Payment Fraud – Case Study</h3>
<p>Something we have been working on for a little while is a way to identify users who are most likely to be spammers before they actually leave a comment, rather than just relying on identifying the comment itself as spam. For truly automated spam this is pretty easy to do; for human spammers (people being paid through systems like Amazon Mechanical Turk to leave comments) it&#8217;s a little harder, but not much. Using a “first past the post” point-scoring system, so that no single issue will cause the visitor to be considered a problem, you can build up a passive profile and compare it to that of a typical spammer, who will have a specific click pattern, hover over search words and often come from blocks of known IPs or, for some blogs, from a certain country. Once the passive profile has been matched, the active profiling is engaged and forces the user into a simple question: “Are you a spammer?” OK, not quite. However, when the visitor makes a comment, rather than it being submitted immediately the user is presented with a logic puzzle or simply a question. This is not to prove they are human, as we actually know that already, but to see if they have engaged with the site or the post. If they pass this active profiling then the normal spam prevention kicks in; if they don&#8217;t, they can be subjected to whatever torments the webmaster has in store for them (a Java applet is always cruel). </p>
<p>Similar techniques can be used to identify potentially fraudulent transactions: where a user&#8217;s passive profile indicates their behavior is “suspicious”, the active aspect would be used to make them repeat part of the transaction process.</p>
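<p>A point-scoring check of this kind might look something like the Python sketch below. The signal names, weights and threshold are all assumptions made up for the example; real values would have to come from your own logs.</p>

```python
# Hypothetical signals and weights, chosen so that no single signal
# can reach the threshold on its own.
SIGNAL_POINTS = {
    "known_bad_ip_block": 3,
    "hovered_search_terms": 2,
    "spammer_click_pattern": 3,
    "high_risk_country": 1,
}
CHALLENGE_THRESHOLD = 5  # assumed cut-off for engaging active profiling


def spam_score(signals):
    """Add up the points for every signal observed for this visitor."""
    return sum(SIGNAL_POINTS.get(name, 0) for name in signals)


def needs_challenge(signals):
    """True once the combined score reaches the threshold."""
    return spam_score(signals) >= CHALLENGE_THRESHOLD


print(needs_challenge({"high_risk_country"}))
print(needs_challenge({"known_bad_ip_block", "spammer_click_pattern"}))
```

<p>Only visitors who trip the threshold see the question or puzzle, so ordinary commenters are never interrupted.</p>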
<h3>Overusing Active Profiling</h3>
<p>This post was inspired when my friend <a href="http://www.andybeard.eu">Andy Beard</a> pointed me to a marketing tool which uses active profiling (though, being marketers, they added 400 additional buzzwords), via what you could consider traditional quizzes, to generate sales pages. When he initially pointed me to the program I was presented with a quiz asking quite leading and obvious questions before I could “see the video”. After the second question I simply started clicking randomly until it went away; only then did I realise that the irritating, annoying thing was in fact what Andy wanted to show me! </p>
<p>So why did it fail? Well, overuse of active profiling results in an “oh, get out of the way” mentality, where the user blindly clicks or feeds in false information just to get rid of, or past, the profiling system. This is a triple disaster: you have gathered useless information, you don&#8217;t know you have gathered duff information (unless you are confirming passive profiling and double-checking for errors), and you have an upset visitor who is more likely to bail. The solution is to use active profiling sparingly and only as part of a sales funnel.</p>
<p>Do you use any of these techniques? If so have they helped?<br />
If you have found this article interesting you might also like to take a look at my recent introduction to this type of <a href="http://www.timnash.co.uk/12/2008/profiling-multivariate-landing-page-users/">user profiling</a>.</p>
<div id="vs-message">
<strong>Stuff Consulting</strong><br />
Are you interested in profiling and grouping your users? Then why not think about hiring a Stuff Consultant! See my <a href="http://www.timnash.co.uk/consulting/">consulting services</a> for more information or why not <a href="http://www.timnash.co.uk/contact/">get in touch</a>!</div>
]]></content:encoded>
			<wfw:commentRss>http://www.timnash.co.uk/06/2009/active-vs-passive-profiling/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Case Study &#8211; Profiling Landing Page Users</title>
		<link>http://www.timnash.co.uk/12/2008/profiling-multivariate-landing-page-users/</link>
		<comments>http://www.timnash.co.uk/12/2008/profiling-multivariate-landing-page-users/#comments</comments>
		<pubDate>Sun, 14 Dec 2008 13:20:39 +0000</pubDate>
		<dc:creator>Tim Nash</dc:creator>
				<category><![CDATA[Advanced SEO]]></category>
		<category><![CDATA[Case Studies]]></category>
		<category><![CDATA[Search Marketing]]></category>

		<guid isPermaLink="false">http://www.timnash.co.uk/?p=221</guid>
		<description><![CDATA[User profiling is a vital but often missed tool when developing landing pages. In this case study I show the steps we went through to increase a client's landing page conversions, not by changing the variables in the landing pages but by segmenting the traffic sources and profiling them. Warning: this post contains some logic concepts.

]]></description>
				<content:encoded><![CDATA[<p>After posting on my Facebook status (I don&#8217;t use Twitter) about doing some testing, I got a couple of queries about what I was up to. My current project is a bit hush-hush, so I thought I would introduce this case study instead. I have also provided what I hope are some handy hints dotted throughout.</p>
<hr />
<p>Affiliates are great. No, seriously: they are the online seller&#8217;s best asset, and without them many of the big stores would barely make a penny. So it&#8217;s strange how many stores take their affiliates for granted, treating them as one homogeneous block. We were asked to do some work for a client and I thought I would share some of the theory behind it (warning: this post contains logic and a sprinkling of basic math). Even though it&#8217;s not strictly an SEO post, many SEOs are asked to perform this sort of statistical analysis every day.</p>
<p><strong>The problem</strong><br />
Company A sells a diverse range of products. It has a large affiliate base and is keen on its affiliates doing the legwork; its philosophy is to help the affiliates along as much as possible. While it doesn&#8217;t have a dedicated affiliate landing page (it prefers to encourage deep linking to specific products), its “special offer” pages are used within PPC campaigns and as a focal point for affiliates&#8217; activities. These pages are regularly tested, but traditional split testing has thrown up a problem. Several of the high-earning affiliates have low conversion rates on these pages, while the PPC campaign is doing well and, according to traditional split testing, the current page is the best performing of all the recipes that have been presented over a large dataset.</p>
<p><strong>The traditional solution</strong><br />
We have numerous traffic sources all arriving at one area. The traffic is bound to be looking for different things and to be at different points in the buying process, so it makes sense to simply split this traffic into groups and optimise for each group. So we now have 3 separate experiments running, each with the same recipes. The results come in: PPC is up, as is the miscellaneous organic traffic, but affiliate conversions, while up slightly, are still disappointing. Time to investigate what&#8217;s going on.</p>
<h3>Interrogating demographics</h3>
<p>Ecommerce sites have a wealth of information about their users and their buying practices, and with a little bit of data mining we can create profiles of the shoppers on the site (we have 12 in this study). When we match those with where shoppers have come from, we can begin to work out what sort of user is coming from each of our traffic sources. Obviously there is crossover: sources such as PPC and organic traffic should produce a diverse set of profiles, and at first glance so will affiliates. It is only when you break down your affiliate traffic by affiliate that the number of profiles reduces. </p>
<p>Since I really wasn&#8217;t interested in affiliates making one sale a year, I quickly discarded a few thousand records and associated bits, leaving a core group of 500 or so affiliates. Each one&#8217;s sales patterns were looked at and its users profiled. Interestingly, the top 5 affiliates had far less crossover in profiles than those earning a little less. This suggests that they were working distinct niches and tending not to step on each other&#8217;s toes. The next step was to compare these profiles to the conversion information for our landing pages: were they way outside the normal profiles?</p>
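<p>One way to put a number on “crossover” between affiliates is to compare the sets of shopper profiles each one sends. The original analysis doesn&#8217;t say how crossover was measured, so the Jaccard overlap used in this Python sketch is my own assumption, and the affiliate names and profile sets are invented.</p>

```python
from itertools import combinations


def overlap(profiles_a, profiles_b):
    """Jaccard overlap: shared profiles divided by all profiles seen."""
    a, b = set(profiles_a), set(profiles_b)
    union = a.union(b)
    return len(a.intersection(b)) / len(union) if union else 0.0


def mean_pairwise_overlap(profiles_by_affiliate):
    """Average overlap across every pair of affiliates in the group."""
    pairs = list(combinations(profiles_by_affiliate.values(), 2))
    if not pairs:
        return 0.0
    return sum(overlap(a, b) for a, b in pairs) / len(pairs)


# Invented example: three affiliates and the profiles their traffic falls into.
top_affiliates = {
    "affiliate_1": {"impulse"},
    "affiliate_2": {"bargain hunter"},
    "affiliate_3": {"impulse", "infrequent"},
}
print(mean_pairwise_overlap(top_affiliates))
```

<p>A low average for the top earners and a higher one for the next tier down would match the pattern described above.</p>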
<hr />
<p><strong>Quick tip</strong> – Profiling and data mining is one of the key elements of ecommerce, and one that even some of the big boys can&#8217;t manage to get right! Take Amazon as an example of a company with flawed profiling and data mining. While it&#8217;s clear they are profiling users, they are not cross-referencing with previous purchases, so they send emails suggesting you purchase books you have already bought from them! If you&#8217;re interested in profiling and customer data mining I strongly recommend <a href="http://www.amazon.co.uk/gp/product/0749453389?ie=UTF8&#038;tag=timnasblo-21&#038;linkCode=as2&#038;camp=1634&#038;creative=6738&#038;creativeASIN=0749453389">Scoring Points: How Tesco Continues to Win Customer Loyalty</a>; it&#8217;s a good read over the Christmas break.</p>
<hr />
<p>Our best-performing profiles on landing pages were 3 distinct profiles; to keep things simple we shall call them the bargain hunter, the infrequent and the impulse. In reality each of these had 2 or more subset profiles, but we can quickly see that the special offer landing pages are converting people who are looking for bargains, people who don&#8217;t spend often with the store, and people whose shopping patterns are erratic but who often put more than one thing in the basket before checkout. This is not surprising, but it&#8217;s always nice to have stats back up common sense, particularly as that&#8217;s not always the case! Our top affiliates&#8217; user profiles were, on the whole, not part of our high-conversion profile grouping, though there was some crossover with the impulse profile. Interestingly, affiliates were producing the highest returning-customer group even from these pages (indicating that even if visitors didn&#8217;t buy the deal they were remembering the site). Throughout all of this one affiliate stood out: his conversion rate was close to 25%, while his traffic stats were similar to those of the other top affiliates. These figures were so high they were hard to believe, and our client was suspicious that something was afoot. His traffic was almost entirely a single profile, purchasing specific products and only those products. For our client his users were infuriating: not responding to future mailings, and rarely returning except through his links to specific products. Something was clearly going on. Could he be part of some secret shopping sect? Was he brainwashing? Or was something darker going on?</p>
<h3>Investigating an Affiliate</h3>
<p>The vast majority of affiliates get on with their affiliate managers; it&#8217;s a symbiotic relationship, and one can&#8217;t exist without the other. Most affiliates therefore don&#8217;t mind when their manager calls them for a chat, as often it&#8217;s going to be about some offer or a rise in commission. When someone refuses to communicate it signals, if nothing else, a deterioration in the relationship. When we were unable to contact this particular affiliate it seemed that we might have a bad egg, so it was time to unravel his traffic sources. Like most affiliates he used a couple of central redirects, both to hide the real source and to gather stats. A whois and DNS lookup showed that both of these were on a single server, and after a few minutes we had 122 domains, mainly .infos, to look through. Passing those through our link mapping software produced nearly 400k backlinks across the network. Nice, that&#8217;s a weekend of fun. Manually scanning through such a large dataset is not really feasible, but a small sample of the 122 indicated that most could be considered “made for AdSense” style sites: scraped or just crud content with links pointing to other parts of the network and a pile of ads. Thankfully the link mapping also revealed 6 domains (all hosted on separate servers) that the network was pointing to. So far we still only have a suspicion and some dodgy link-building methods, certainly nothing to damn the affiliate, but we now also have the main sites he has been promoting.</p>
<hr />
<p><strong>Quick tip</strong> – People tend to think micro instead of macro. Making your site look less spammy at a human glance doesn&#8217;t help if you&#8217;re linking into a large network. Companies and search engines look for large-scale patterns and work backwards, which is why paid links from brokers are so easy to identify if you have a large enough dataset.</p>
<hr />
<p>The 6 sites use some interesting cloaking to serve overly keyword-stuffed content to Google, but on the whole they have good landing pages. The pages are auto-generated and offer discount coupons for every product. They are also incentivising the offers through a Ponzi scheme (the cheeky bugger!): incentives were only issued if a user not only bought the product with the coupon but also recruited 3 other users who did the same. The whole thing was financed by the fact that the affiliate earned a standard commission regardless of the coupon, and so was using the difference to pay out in the unlikely case he needed to. I have no doubt that when you joined “the club” you would be put on a mailing list and treated to a bombardment of mailshots. The system was ingenious but totally against our client&#8217;s TOS and, depending on exactly how he was funding the payouts, possibly illegal in the UK. So, having ruled out our super affiliate as the answer, we handed our findings over to the client to deal with and went back to the original question.</p>
<p>With our anomaly ruled out as a way to improve affiliate conversions, it was time to start gathering information on the presale our affiliates were using. This was done in more or less the same way as we investigated the first affiliate. In addition, each affiliate was sent a very quick questionnaire. After a couple of follow-up conversations with several of the affiliates, we probably knew more about our client&#8217;s affiliates than the client did! </p>
<p>Our client&#8217;s affiliates fell into four main groups: the lander, the blogger, the comparer and the mailing list guru. The landers are affiliates using lots of mini sites to push very targeted products, often relying heavily on PPC and SEO. The bloggers run blogs, which in our client&#8217;s case were mainly gadget and sport blogs; they would target individual products but would often reuse our client&#8217;s site over and over again. They were the group with the most residual traffic after a launch. Comparison websites are big business for affiliates, and with some studies showing 1 in 4 people now using a comparison site to find online purchases, it&#8217;s no surprise that 2 of our client&#8217;s top 5 affiliates were primarily using comparison sites to drive traffic. Finally the mailing list, such a powerful tool in Internet marketing, was actually the least used, but as our client mentioned when we started, they always knew when one of their affiliates had released his newsletter by the small spike in traffic. The downside: this affiliate brought almost no long-term traffic.</p>
<hr />
<p><strong>Quick Tip</strong> – It&#8217;s interesting to note that most big affiliate marketers, while they might have fingers in many pies, tend to have one major type of traffic source even if they have multiple sites and lists. However, the top affiliate in this study (ignoring the club owner) used a blog and mailing list combination, which in turn had comparison charts.</p>
<hr />
<p>With our affiliates&#8217; traffic sources and profiles in mind we went back to the split testing. Providing a unique, adapting page for every affiliate was not practical, so instead we took our 4 traffic source groups and cross-referenced them with their user profiles. The result was 7 diverse profile groups, each of which was then run in an experiment of its own with the original recipes. Results were pleasing, with all 7 experiments seeing a large leap in conversions. Interestingly, none of the 7 experiments matched each other exactly, yet our comparison website profile experiment matched the PPC one. </p>
<blockquote><p><strong>Explanation</strong> – OK, someone is bound to ask: if we have experiments which rely on profile information and we don&#8217;t yet have any user data, how are we assigning a user profile before gathering the information? The answer is that we, or rather our testing software, make an educated guess. We then refine the results later, pushing the visit back into the correct group and marking the original as a corrupt test. The guess itself is generated using a genetic algorithm seeded with whatever demographic info we do have available; we use that demographic data as a means to cause mutation within our default population. To provide an initial population in both experiments we include two or more mutators (profile indicators), and we can feed back information from successful sales. The result should introduce two or more variants. Once we have enough feedback information we can remove or scale back our initial mutators.</p></blockquote>
<h3>Alternate Approach – Evolution</h3>
<p>While we rely on genetic algorithm math to make educated guesses about profiling, it can actually be used to generate the entire landing page profiling as an alternative to our experimentation. Rather than considering our traffic source as the population, we would consider the page to be the population, with our testing variables as the individuals. Since genetic algorithms are ideally suited to scoring an individual within a set as positive or negative, we can use the same math to “breed” our individuals (testing variables) and identify the strongest matching pairs, which can in turn be tested against one or more other pairs or individuals. There have been several interesting experiments on this type of split testing and it&#8217;s something we have found works well on large datasets. However, the additional processing power may not be justified unless the results show a marked improvement in conversions.</p>
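<p>For anyone who hasn&#8217;t met genetic algorithms before, here is a toy Python sketch of the “breeding” idea applied to page variables. It is not the software used in this study: the page variables, the selection scheme (keep the fitter half, uniform crossover, light mutation) and especially <code>toy_fitness</code>, which stands in for a measured conversion rate, are all assumptions for illustration.</p>

```python
import random

# Hypothetical page variables; each individual picks one option per variable.
OPTIONS = {
    "headline": ["price", "features", "urgency"],
    "button": ["blue", "green", "orange"],
    "image": ["product", "lifestyle"],
}


def random_individual(rng):
    """One candidate page: a random option for every variable."""
    return {name: rng.choice(values) for name, values in OPTIONS.items()}


def breed(parent_a, parent_b, rng, mutation_rate=0.1):
    """Uniform crossover, then an occasional random mutation per variable."""
    child = {}
    for name, values in OPTIONS.items():
        child[name] = rng.choice([parent_a[name], parent_b[name]])
        if rng.random() > 1 - mutation_rate:
            child[name] = rng.choice(values)
    return child


def evolve(fitness, generations=20, size=12, seed=0):
    """Keep the fitter half each generation and breed replacements from it."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: size // 2]
        children = [
            breed(rng.choice(survivors), rng.choice(survivors), rng)
            for _ in range(size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness)


def toy_fitness(page):
    """Stand-in for a measured conversion rate (invented numbers)."""
    score = 0.01
    score += 0.02 if page["headline"] == "price" else 0.0
    score += 0.01 if page["button"] == "orange" else 0.0
    score += 0.005 if page["image"] == "product" else 0.0
    return score


best = evolve(toy_fitness)
print(best, toy_fitness(best))
```

<p>In a live test the fitness would come from real visitors, which is exactly why this approach needs a large dataset: every generation has to be shown to enough people to score it.</p>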
<hr />
<p><strong>Quick Tip</strong> &#8211; A lot of Internet marketers will refer to a statistical technique called the Taguchi Method; indeed many see it as a holy grail, as it allows you to run experiments with what would traditionally be considered small datasets and return results similar to large-scale tests. Don&#8217;t be fooled into going down this route. If your dataset is not large enough to run large multivariate tests, concentrate on smaller A/B split testing and on increasing your dataset size; any gains from statistical analysis methods will be minimal at this size anyway.</p>
<hr />
<p><strong>Quick Tip</strong> – Collected statistical data is temporal: it is affected by time. An obvious example is that a searcher looking for Funny Christmas Hats today (mid-December) is likely to have a different profile from one looking in January. The search may be the same, as perhaps is the intention, but chances are the result is not. </p>
<hr />
<h3>Affiliates are not a single group</h3>
<p>When running a special offer page, the client site now has 9 experiments running on it, reflecting 12 profile groups and 6 distinct traffic sources. The alternative, without analysing both the profiles and the traffic sources, would have been to assume that each traffic source was capable of producing all 12 profiles (which they are, but in most cases in statistically negligible numbers), resulting in 72 separate experiments. Without profiling we would never have been able to identify personality groups and would have relied on traffic and guesswork. The key is to remember that testing is only useful if you know which variables are actually controlling the test. </p>
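<p>The 72 comes straight from multiplying 12 profiles by 6 sources. A rough Python sketch of how negligible combinations might be dropped is below; the visit counts, the “collector” profile, the 5% cut-off and the <code>significant_pairs</code> helper are all made up for the example, and this is not how the 9 experiments in the study were actually derived.</p>

```python
PROFILE_GROUPS = 12
TRAFFIC_SOURCES = 6
# Every source crossed with every profile.
naive_experiments = PROFILE_GROUPS * TRAFFIC_SOURCES

# Invented visit counts per traffic source and shopper profile.
VISITS = {
    "ppc": {"bargain hunter": 400, "impulse": 350, "infrequent": 240, "collector": 10},
    "blogger": {"impulse": 90, "collector": 8, "bargain hunter": 2},
}


def significant_pairs(visits, min_share=0.05):
    """Keep (source, profile) pairs with at least min_share of that source's traffic."""
    kept = []
    for source, by_profile in visits.items():
        total = sum(by_profile.values())
        for profile, count in by_profile.items():
            if total and count / total >= min_share:
                kept.append((source, profile))
    return kept


print(naive_experiments)
print(significant_pairs(VISITS))
```

<p>Pairs that fall below the cut-off are the “statistically negligible” ones: they exist, but there is too little traffic behind them to justify an experiment of their own.</p>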
<hr />
<p>I hope this little study has helped people to see the potential of mining your own data and that of your affiliates, but it has been very much theory based, so here are some handy and slightly more practical tips.</p>
<h3>Generating Profiles</h3>
<p>You cannot know your customers individually, and eventually you have to accept that you are going to have to start grouping them. Using their spending habits along with other demographic information you should be able to split your users into between 6 and 18 distinct groups. Once you have your profile groups you can target and market to these groups.</p>
<h3>Pool your data externally</h3>
<p>Most sites do not have all their customer data in one place: apart from order and purchase logs, stats will be held elsewhere and questionnaire data in yet another location. Keeping in mind the concept of temporal data, it is far easier to take snapshots of your data sources and pool them for analysis. While most of the work is automated, the initial analysis and some interim work still need to be compiled, so having a stats package handy is always good. While I know most can&#8217;t justify a copy of Crystal, Matlab or SPSS, there are free alternatives that will work well for you, such as the <a href="http://www.gnu.org/software/pspp/">open source PSPP</a>.</p>
<hr />
<p><strong>Quick Tip</strong> – Make sure your temporal data is all in the same timezone, particularly for non-US users whose servers are in the US.</p>
<hr />
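<p>As a rough illustration of the tip above, here is a minimal Python sketch that converts timestamps from two data sources to UTC before pooling them. The source names and the fixed offsets are assumptions made up for the example; a real pipeline would more likely use named timezones so that daylight saving is handled.</p>

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed UTC offsets (in hours) for each data source.
SOURCE_UTC_OFFSETS = {
    "order_log": 0,   # e.g. a UK database in winter
    "web_stats": -5,  # e.g. a US east coast server in winter
}


def to_utc(local_timestamp, source):
    """Parse 'YYYY-MM-DD HH:MM:SS' as recorded by `source` and return it in UTC."""
    naive = datetime.strptime(local_timestamp, "%Y-%m-%d %H:%M:%S")
    offset = timezone(timedelta(hours=SOURCE_UTC_OFFSETS[source]))
    return naive.replace(tzinfo=offset).astimezone(timezone.utc)


# The same moment, as written down by the two sources.
order_time = to_utc("2008-12-01 14:30:00", "order_log")
visit_time = to_utc("2008-12-01 09:30:00", "web_stats")
print(order_time == visit_time)
```

<p>Once everything is in UTC, a purchase and the visit that led to it line up again, instead of appearing to be five hours apart.</p>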
<h3>Is it a gift?</h3>
<p>During the purchasing process it&#8217;s a great idea to find out the reason for the purchase. Asking this simple, easy-to-answer question allows us to discard the user from our normal product profiling; what&#8217;s more, it allows a simple upsell by adding wrapping and alternate delivery at a small cost.</p>
<h3>Interrogate your data not your users</h3>
<p>Questionnaires are a great source of information. Most opinion is subjective, though the demographic data is not, but people who answer such questions will lean towards one or more profile groups. When asking users questions, keep it short and sweet: you are looking for ways to “enhance” their future experience, not put them off ever shopping with you again.</p>
<h3>Think Tescos – Unique Coupons</h3>
<p>If I have a set of profiles which can be told apart by purchase history and I have a user&#8217;s purchase history, it should be easy enough to provide a unique coupon designed to encourage them back to the store. Tesco are experts at this through their Clubcard, but very few online retailers use targeted coupons, mainly because of delivery methods. But by providing a monthly newsletter, with generic material chosen based on purchase history and unique coupons based on purchase history and profile, a user can be tempted back.</p>
<p>A simple example: 2 users are both interested in computers and have both purchased several items in the past. Customer A is an impulse buyer: his coupon is a percentage discount on a single product. The discount is substantial, but the site makes money on the chance that he will add more items to the trolley at full price. Customer B is a bargain hunter: he is presented with a percentage off all items which is significantly lower than customer A&#8217;s percentage off. However, customer B is likely to purchase only a couple of items, so the store wishes to retain some markup on them.</p>
<p>The downside to this method is the sheer amount of data that is required, plus maintaining a dependable delivery method (assuming email, this means running your own mailing list system rather than using a 3rd party like Aweber).</p>
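<p>The two-customer example above could be sketched in Python roughly as follows. The profile names, the percentages and the <code>build_coupon</code> function are all hypothetical and only there to show the shape of the idea, not a real coupon system.</p>

```python
# Hypothetical discount levels; real values would come from your own margins.
IMPULSE_PERCENT_OFF = 30  # substantial, but on one product only
BARGAIN_PERCENT_OFF = 10  # lower, but applies to the whole basket


def build_coupon(profile, favourite_product):
    """Return a coupon for a customer profile, or None if no profile matches."""
    if profile == "impulse_buyer":
        # Big discount on a single product; anything else added to the
        # trolley is sold at full price.
        return {"applies_to": favourite_product, "percent_off": IMPULSE_PERCENT_OFF}
    if profile == "bargain_hunter":
        # Smaller discount on everything, so some markup is retained.
        return {"applies_to": "all_items", "percent_off": BARGAIN_PERCENT_OFF}
    return None


customer_a = build_coupon("impulse_buyer", "graphics card")
customer_b = build_coupon("bargain_hunter", "graphics card")
print(customer_a, customer_b)
```

<p>The point is simply that the profile, not just the purchase history, decides which kind of offer a user sees.</p>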
<p><em>What methods do you use for profiling users? Do you think we can pigeonhole customers into 6-18 little groups?<br />
</em></p>
<div id="vs-message">
<strong>Stuff Consulting</strong><br />
Are you interested in profiling and grouping your users? Then why not think about hiring a Stuff Consultant! See my <a href="http://www.timnash.co.uk/consulting/">consulting services</a> for more information, or why not <a href="http://www.timnash.co.uk/contact/">get in touch</a>!</div>
]]></content:encoded>
			<wfw:commentRss>http://www.timnash.co.uk/12/2008/profiling-multivariate-landing-page-users/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>It&#8217;s all in the MP3s</title>
		<link>http://www.timnash.co.uk/06/2008/seo-mp3/</link>
		<comments>http://www.timnash.co.uk/06/2008/seo-mp3/#comments</comments>
		<pubDate>Fri, 27 Jun 2008 14:34:06 +0000</pubDate>
		<dc:creator>Tim Nash</dc:creator>
				<category><![CDATA[Advanced SEO]]></category>

		<guid isPermaLink="false">http://www.timnash.co.uk/?p=167</guid>
		<description><![CDATA[It&#8217;s been nearly two weeks since I last blogged anything, and this is because I have been stuck with a bit of a problem. I went to Mashed last Saturday and Sunday (June 21 2008). It was, like the previous year&#8217;s Hackday, good fun, though perhaps a little dull in comparison. I won&#8217;t lie: I was [...]]]></description>
				<content:encoded><![CDATA[<p>It&#8217;s been nearly two weeks since I last blogged anything, and this is because I have been stuck with a bit of a problem. I went to Mashed last Saturday and Sunday (June 21 2008). It was, like the previous year&#8217;s Hackday, good fun, though perhaps a little dull in comparison. I won&#8217;t lie: I was disappointed with the talks on the whole, but then with 30 minute slots it&#8217;s never easy. The one talk that did spark some interest was from the BBC on accessibility, and covered the use of subtitling in videos and enhanced metadata in podcasts. </p>
<p>In fact this was enough to inspire a hack, <a href="http://www.newmedias.co.uk/mashed2008/">we do ID3</a>. It was interesting that the BBC had decided to put binary data inside binary data (i.e. images) rather than concentrating on additional contextual data in their enhanced podcasts, but that doesn&#8217;t mean other people have to.</p>
<p>So why the silence&#8230;</p>
<p>Well, it worked. Not the hack (though my <a href="http://www.porkandpaws.com/2008/06/22/accessible-media-mashed/">hat tip to Shaun</a>, who coded amazingly well considering at one point we were talking about the intricacies of binary extraction using nothing more than hand gestures and some very <a href="http://blogs.guardian.co.uk/digitalcontent/2008/06/_mashed_2008_the_bbcs_amazing.html">dodgy diagrams</a>). No, the silence was caused by the fact that I came back and thought about where the ID3 data was being kept and how the headers worked.<br />
<img src="http://www.timnash.co.uk/wp-content/uploads/2008/06/mashed1.jpg" alt="Dr Who how cool is that" /><br />
So there you have it: success has made me silent <img src='http://www.timnash.co.uk/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  but, to give people a hint, imagine how many badly formed HTML pages Google must see every day and yet still manages to rank. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.timnash.co.uk/06/2008/seo-mp3/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Am I an Advanced SEO?</title>
		<link>http://www.timnash.co.uk/06/2008/advanced-seo/</link>
		<comments>http://www.timnash.co.uk/06/2008/advanced-seo/#comments</comments>
		<pubDate>Thu, 12 Jun 2008 13:35:12 +0000</pubDate>
		<dc:creator>Tim Nash</dc:creator>
				<category><![CDATA[Advanced SEO]]></category>
		<category><![CDATA[SEO Introduction]]></category>

		<guid isPermaLink="false">http://www.timnash.co.uk/?p=158</guid>
		<description><![CDATA[There has been a fair amount of debate on what constitutes advanced SEO in the search marketing blogosphere, mainly coming from a couple of comments and reactions to SMX Advanced. For those of you who haven&#8217;t heard of this event, it is organised by Danny Sullivan and was a conference with all the normal faces saying, [...]]]></description>
				<content:encoded><![CDATA[<p>There has been a fair amount of debate on what constitutes advanced SEO in the search marketing blogosphere, mainly coming from a couple of comments and reactions to SMX Advanced. For those of you who haven&#8217;t heard of this event, it is organised by Danny Sullivan and was a conference with all the normal faces saying, from what I can gather, all the normal things. According to a couple of people who did attend (I did not, so this is hearsay), the most advanced things there were the vending machines. </p>
<p>It is no secret I have become deeply disillusioned with the search marketing bloggers recently, and tend not to pay attention to them as a whole. Then Quadzilla wandered in with the excellent link bait: all advanced SEO is blackhat!<br />
<img src="http://www.timnash.co.uk/wp-content/uploads/2008/06/binhat.jpg" alt="Bin your Hat" /></p>
<h3>I don&#8217;t believe in hats</h3>
<p>I have the concept of lines in the sand: my line is somewhere beyond the Google Guidelines but even further away from a jail cell. So for me advanced SEO can&#8217;t be blackhat, as blackhat doesn&#8217;t exist in my world. But this leaves a bigger question: is there anything that is &#8220;Advanced SEO&#8221;?</p>
<h2>Advanced SEO?</h2>
<p>I want to reach a consensus on what advanced is, so here is a list of our day-to-day techniques and activities. Which, if any, are advanced?</p>
<ol>
<li>No Follow for page sculpting</li>
<li>SERP Hijacking</li>
<li>htaccess redirects</li>
<li>REP</li>
<li>Regex</li>
<li>IP Content Delivery</li>
<li>Link Mapping</li>
<li>Link Analysis &#8211; Link Worth</li>
<li>Crawl Spiders</li>
</ol>
<p>These are some of the activities we do on a day-to-day basis: we build crawl spiders, develop IP content delivery systems and map link data layers. OK, maybe it&#8217;s the skills and knowledge needed that are advanced; perhaps these could be where the &#8220;Advanced SEO&#8221; is?</p>
<ol>
<li>Programming &#8211; PHP, Python, C++</li>
<li>Database programming</li>
<li>Unix system programming</li>
<li>Statistics gathering and analysis</li>
<li>Discrete mathematics</li>
<li>Numerical Analysis</li>
</ol>
<p>OK, those are some of the skills and knowledge we use, but perhaps the &#8220;Advanced SEO&#8221; is the assumed knowledge. Maybe it&#8217;s the experience that makes a person an Advanced SEO? </p>
<p><strong>I want to leave you with a question&#8230;</strong><br />
<img src="http://www.timnash.co.uk/wp-content/uploads/2008/06/cat.jpg" alt="Geeky reference" /></p>
<p>Does something exist if you&#8217;re not doing it? </p>
<p>Is the reason that some question the existence of different levels and skills required in SEO that they themselves have never achieved them? It&#8217;s something to think about. How I help my clients is by bringing them the best information I can, through techniques that I think most people simply do not use, but are they advanced? I don&#8217;t think so, but then I wouldn&#8217;t: after all, I do them every day!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.timnash.co.uk/06/2008/advanced-seo/feed/</wfw:commentRss>
		<slash:comments>25</slash:comments>
		</item>
		<item>
		<title>Where on the page are your links from?</title>
		<link>http://www.timnash.co.uk/06/2008/on-page-link-location/</link>
		<comments>http://www.timnash.co.uk/06/2008/on-page-link-location/#comments</comments>
		<pubDate>Mon, 02 Jun 2008 11:28:20 +0000</pubDate>
		<dc:creator>Tim Nash</dc:creator>
				<category><![CDATA[Advanced SEO]]></category>
		<category><![CDATA[Search Marketing]]></category>
		<category><![CDATA[link mapping]]></category>
		<category><![CDATA[link patterns]]></category>

		<guid isPermaLink="false">http://www.timnash.co.uk/?p=153</guid>
		<description><![CDATA[In a previous post I discussed Link Worth and, prior to that, a technique called Block Segmentation Analysis, in which I briefly discussed some research done by Microsoft into Visual Page Segmentation where, rather than using the DOM, they use actual images rendered from a browser to determine what the blocks are. While this is overkill [...]]]></description>
				<content:encoded><![CDATA[<p>In a previous post I discussed <a href="http://www.timnash.co.uk/05/2008/link-worth/">Link Worth</a> and, prior to that, a technique called <a href="http://www.timnash.co.uk/05/2008/block-segmentation-analysis/">Block Segmentation Analysis</a>, in which I briefly discussed some research done by Microsoft into Visual Page Segmentation where, rather than using the DOM, they use actual images rendered from a browser to determine what the blocks are. While this is overkill for most, I was intrigued enough to create our own version with Opera, Imagemagick and some Python. The purpose of my little experiment, though, was not to look for blocks (I had already identified them through the DOM) but to find the location of individual links on a page.</p>
<p>I have a whole other post dedicated to the subject of how, but quickly: each page was grabbed, all block elements (div, span etc) were turned one colour and stripped of all other info to make life easier, and the block we wished to identify was turned red. Then the page had a grid overlaid (24&#215;24) and each square was checked for the colour of its content. </p>
<p>With a working mechanism we then went and pulled the backlinks of 100 SEO sites (sites which advertised services rather than just blogs, though many did have a blog as well), removed links that were internal, capped them at 200 per site, and mapped them to our grid. The result (17680 backlinks later) is the ability to see popular locations for inbound links to SEO websites, which may or may not interest you.</p>
<p>The following is a pseudo heat map: blue is a limited number of links, orange a reasonable amount and red a lot.<br />
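<p>The grid check described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the original implementation: the screenshot is assumed to be already loaded as rows of (r, g, b) tuples, and the function name <code>red_grid_cells</code> is made up for the example.</p>

```python
GRID_SIZE = 24       # the 24x24 overlay
RED = (255, 0, 0)    # colour given to the block we want to locate
GREY = (128, 128, 128)


def red_grid_cells(pixels, grid_size=GRID_SIZE):
    """Return the set of (row, col) grid squares containing at least one red pixel.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    """
    height = len(pixels)
    width = len(pixels[0])
    cells = set()
    for y, row in enumerate(pixels):
        for x, colour in enumerate(row):
            if colour == RED:
                # Scale the pixel position down to a grid coordinate.
                cells.add((y * grid_size // height, x * grid_size // width))
    return cells


# Tiny synthetic "screenshot": a 48x48 grey page with a red block top right.
page = [[GREY for _ in range(48)] for _ in range(48)]
for y in range(0, 4):
    for x in range(44, 48):
        page[y][x] = RED

print(sorted(red_grid_cells(page)))
```

<p>Each link then ends up as a handful of grid coordinates, which is what makes pages of different sizes comparable.</p>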
<img src="http://www.timnash.co.uk/wp-content/uploads/2008/06/heatmap.jpg" alt="pseudo heat map of links on inbound link pages" /><br />
As you can see, the first thing to notice is that there is a lot of blue; the red areas are primarily in sidebars and the footer, inching into what would be the comment area on blogs. The orange is a bit more evenly spread out; of interest, though, is the lack of any serious link cluster in the page content itself. The left hand side is interesting: there has been a trend away from left hand sidebars, so comments in a blog may be counted in that left hand side.</p>
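<p>For anyone wanting to build a similar pseudo heat map, here is a minimal Python sketch of the bucketing step. The thresholds and the <code>heat_colours</code> name are assumptions for illustration only; the cut-offs behind the image above are not given here. Only squares with at least one link get a colour.</p>

```python
from collections import Counter


def heat_colours(link_cells, orange_from=5, red_from=20):
    """Count links per grid square and bucket each square into a colour.

    `link_cells` is an iterable of (row, col) squares, one entry per link.
    The thresholds are hypothetical and should be tuned to your own data.
    """
    counts = Counter(link_cells)
    colours = {}
    for cell, total in counts.items():
        if total >= red_from:
            colours[cell] = "red"
        elif total >= orange_from:
            colours[cell] = "orange"
        else:
            colours[cell] = "blue"
    return colours


# Made-up data: 25 links in one sidebar square, 5 in the footer, 1 in content.
sample = [(3, 22)] * 25 + [(23, 10)] * 5 + [(8, 8)]
print(heat_colours(sample))
```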
<p>Useless trivia, out of the 100 SEO sites we tested:</p>
<ul>
<li>86 had 1 or more inbound links coming from the footer area</li>
<li>96 had 1 or more inbound links coming from the right hand sidebar</li>
<li>45 had 1 or more inbound links coming from the left hand side</li>
<li>15 had no inbound links in the content area on any sites</li>
<li>36 had 1 or more inbound links with only an image as the anchor</li>
<li>98% of the footer links found to our SEO sites are site wide</li>
<li>99% of the right hand sidebar links found to our SEO sites are site wide</li>
<li>40% of the left hand side links found to our SEO sites are site wide</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.timnash.co.uk/06/2008/on-page-link-location/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>
