I was chatting to a friend the other day about his newly created site (it looked really sweet) and looking at the CSS (very neat) when I mistyped the URL. My whole world came down. No, literally: not only was he using IIS to host the site, he had no custom error pages. A few days later I noticed he was starting to pay attention to this problem: interesting articles started to appear in his del.icio.us account, and I started to think about the humble 404 and other error messages.
Making it look neat and saying sorry
It doesn't matter who did it. Most 404s are down to someone screwing up, be it a mistyped link or you moving the file. It is therefore important that you say sorry, but equally important that you get the visitor back on track. A perfect 404 (or any error page) should:
- Explain the error - both in layman's terms and, against common opinion, I think it's worth explicitly stating the server response code.
- Say sorry - no, really. Be it humorous or serious, apologise even if it's not your fault
- Get them out of there and to where you want them to go
The 404 is a landing page, albeit one with an uphill struggle. It's worth taking time over, as it can still provide a way to convert visitors; it has perhaps one of the simplest calls to action on the site: keep the visitor there. Both A List Apart and Stuntdubl offer some great advice on designing and developing 404 pages in terms of looks; it's worth exploring the A List Apart method for providing the user with the greatest amount of information. But remember that a good 404:
- Is simple and easy to understand
- Keeps the visitor happy
- Gets them clicking on your site
I tend to remove all the traditional navigation and any side blocks I may have, to make the page stand out. A simple explanation with a dash of humour fills the majority of the page, along with links to the home page and areas of interest. While many recommend a sitemap, I personally do not include one on this page or link to one: it's important that the user experience is improved (remember, you already have an upset visitor), so it's better to take the user by the hand on their first step after this experience. Ultimately this is going to be a trial and error process. You may like to split test variants of your "404 landing page"; if you are going to use Google Analytics then the obvious choice is Google Website Optimizer, but there are plenty of others.
Getting the data
The most important reason we provide custom 404 error pages is to gather information (sorry, did you think it was the user experience?). Something has gone wrong; it therefore needs to be investigated and fixed. To do this we need to know:
- Where did the user come from?
- What were they looking for?
- Where did they go?
If you're using Google Analytics then tracking 404s is easy enough to do. Blogstorm has a great analytics tutorial which includes a section on 404 tracking. Place the following on your 404 page:
<script type="text/javascript" src="http://www.google-analytics.com/urchin.js"></script>
<script type="text/javascript">
// Your Google Analytics account ID
_uacct = "xxxxx-x";
// Log the hit as a virtual page view that records the URL the visitor asked for
urchinTracker("/404.html?page=" + _udl.pathname + _udl.search);
</script>
Replace xxxxx-x with your own analytics code.
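To also answer "where did the user come from", the referrer can be folded into the same virtual page view. What follows is only a rough sketch: `build404Path` is a name I have made up for illustration, and the `from` parameter and the URL encoding are my own assumptions, so the paths it reports will look slightly different from the unencoded ones produced by the plain tracking call.

```javascript
// Hypothetical helper (not part of urchin.js): build the virtual page view
// path for a 404 hit. `loc` is shaped like document.location and
// `referrer` like document.referrer.
function build404Path(loc, referrer) {
  // Record the URL the visitor asked for, encoded so it survives as one value.
  var path = "/404.html?page=" + encodeURIComponent(loc.pathname + loc.search);
  // Only add the referrer when the browser supplied one.
  if (referrer) {
    path += "&from=" + encodeURIComponent(referrer);
  }
  return path;
}

// On the 404 page itself, once urchin.js has loaded:
if (typeof urchinTracker === "function" && typeof document !== "undefined") {
  urchinTracker(build404Path(document.location, document.referrer));
}
```

Keeping the string building in its own function means it can be checked without a browser, and a missing referrer (a typed-in URL or a bookmark) simply leaves the `from` part off.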
We have discussed following outgoing clicks before, but to quickly recap:
<a href="outgoinglink" onclick="urchinTracker('/404/outgoing');">
With a little work we can gain a vast amount of information, and with a combination of filters and email reports we can be notified of errors that are occurring. If you are not already doing so, I strongly recommend setting up goals for content on your site, and this includes conversion tracking on a 404.
Search engines and related problems
This is going to get mildly technical now, so thinking hats on, people. When search engines crawl sites they receive server response codes much like browsers do, but whereas a browser displays the custom error page, not all search engines will crawl such pages; rather, it's the server response code they are interested in.
Throughout this post we have been referring to 404, but this is just a cryptic number. HTTP is the protocol used by browsers and web servers to talk to each other. Without going into depth, it's a method of passing messages: the browser sends a request and the server sends back a response. Each response carries one of a number of status codes, of which 404 is one. For a complete list, visit the W3C information on HTTP responses. A 404 is "page not found", defined as:
The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
It's the "oops, we can't find anything to match that request, and no further instructions have been left" message, or put simply, the "no idea what to do" message. It makes no attempt to determine whether the contents of the page will return, have been moved, or have been permanently removed, each of which has its own status code. Most search engines will therefore keep coming back to visit a 404 link for at least a little while before deciding it's permanently dead. The exception is Google: according to Matt Cutts, Google treats 404s as 410s. Now a 410 is described as:
The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the Request-URI after user approval. If the server does not know, or has no facility to determine, whether or not the condition is permanent, the status code 404 (Not Found) SHOULD be used instead. This response is cacheable unless indicated otherwise.
Put simply, the page has been deliberately removed with no intention of it being returned. Not quite the same as a 404, is it? This means that in theory Google should have no 404 pages in its indices, or nearly none, as it believes that a 404 is a permanent dead link. To be fair to Google, I can see their logic: most webmasters do not use 410s and instead allow 404 (limbo) to exist on a permanent basis. But, and this is a big but, not all 404s lead to 410s; there are plenty of scenarios where a 404 will spring back to life.
The solution to the problem is to not tell Google about 404s but instead send a different status code. I suggest 307, which is a temporary redirect (to a landing page or the home page). This way Google doesn't make a permanent decision about the page, you don't lose any backlinks or link juice from those who might have been linking to the page, and once you find the error you can correct it, either by restoring the page or, if the URL is wrong, by using a 301 permanent redirect.
This method relies on cloaking, which you may remember is bad and against Google's TOS, but I think (maybe a Googler will come and say otherwise) that in this particular case it's "not setting out to deceive users" but rather fixing a problem Google has created. Of course, if you don't take action to prevent and fix 404s quickly, and allow them to remain, then you were the very reason Google took the step of treating 404s as 410s, and so this extra work is a waste of your time (come to think of it, so was this post). For the rest of us, I don't think this mild infringement is a problem.
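The choice between these status codes can be written down as a small decision function. This is a minimal sketch only: the two lookup tables, their example paths and the name `chooseStatus` are all made up for illustration, and it only picks the status and target. Actually sending the response, and deciding who receives which one, is left to your server.

```javascript
// Hypothetical lookup tables; in practice these would come from your CMS or
// server configuration. The paths are invented examples.
var movedPermanently = { "/old-about.html": "/about/" }; // wrong URL, or page moved for good
var removedForGood = { "/2006-promo.html": true };        // deliberately deleted, not coming back

// Decide which HTTP status to send for a path with no page behind it.
// `landingPage` is the target of the temporary redirect suggested above;
// pass null to fall back to a plain 404.
function chooseStatus(path, landingPage) {
  if (Object.prototype.hasOwnProperty.call(movedPermanently, path)) {
    return { status: 301, location: movedPermanently[path] };
  }
  if (Object.prototype.hasOwnProperty.call(removedForGood, path)) {
    return { status: 410, location: null };
  }
  if (landingPage) {
    // Cause unknown and still being investigated: keep the decision temporary.
    return { status: 307, location: landingPage };
  }
  return { status: 404, location: null };
}
```

For example, `chooseStatus("/mystery.html", "/404-landing/")` yields a 307 pointing at the landing page, and once the error has been diagnosed the path moves into one of the two tables and starts returning a 301 or a 410 instead.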
Summing up
404s are an important part of the site. They help not only you but also your users: they can be a great opportunity to turn round a problem and generally help your visitors. They also provide you with valuable information about visitors and their needs.
Update
Smashing Magazine: So what's the chance that a major website and I would cover the same subject? Check out these great 404 pages, but do any follow my advice?
Clever logic: Sebastian of Sebastian's Pamphlets has developed a script that takes the idea of 404s to another level, using 301s (permanent redirects) and some clever logic to match keywords to pages and fix misspellings. The reason for this is that the underlying problem with a 404 is that the moment you show a user one, it's an uphill struggle to make them stay. His goal is to be able to offer alternate content 99% of the time and therefore avoid any error pages.
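I have not seen how Sebastian's script works inside, so the following is not his code. It is just a rough sketch of one way the "fix misspellings" part could be approached: compare the requested path against a list of known pages using edit distance and pick the closest one. The function names, the example paths and the distance threshold are all my own assumptions.

```javascript
// Levenshtein edit distance between two strings (the number of single
// character insertions, deletions or substitutions to turn a into b).
function editDistance(a, b) {
  var prev = [], i, j;
  for (j = 0; j <= b.length; j++) { prev[j] = j; }
  for (i = 1; i <= a.length; i++) {
    var curr = [i];
    for (j = 1; j <= b.length; j++) {
      var cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: return the known path closest to the requested one,
// or null when nothing is within maxDistance edits.
function closestPage(requested, knownPaths, maxDistance) {
  var best = null, bestDistance = maxDistance + 1;
  for (var i = 0; i < knownPaths.length; i++) {
    var d = editDistance(requested.toLowerCase(), knownPaths[i].toLowerCase());
    if (d < bestDistance) { best = knownPaths[i]; bestDistance = d; }
  }
  return best;
}
```

With a threshold of 2, a request for "/abuot/" against the known pages "/about/" and "/contact/" would match "/about/", which could then be the target of a 301, while something unrecognisable returns null and falls through to the normal error page.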








Why did you go to the effort of posting that rambling if you’re not going to follow it yourself?
I don’t control wordpress.com, so I don’t have much choice in how they do their 404s; hopefully they will find this article and implement one.
404s are by theme. Sandbox has one: http://internetducttape.com/404 The defaults are never very good, though.
Responses to this post: