Robots.txt information

TimBot robot descriptions

WTF is a robots.txt

Tim and his bots

introduction

TimBot | SpyBot | TimBot-Crawl | Marvin
Welcome to my menagerie of bots. Some people keep dogs or cats, even pet rats; I keep pet bots, and even my own little virus! This page was set up after a curious soul wanted more information when one of my little pets visited his site. Each of my public bots is listed here, along with a description of what it does, its user agent string, and whether it pays the blindest bit of attention to the Robots Exclusion Protocol (REP).


TimBot version 2

TimBot
TimBot is my generic bot and is in reality multiple programs; each program has a specific purpose and all are totally benign. If TimBot comes visiting, don't panic: in only one of its states will it cause any issues, as explained below.
User Agent

Mozilla/5.0 (compatible; timbot/2.0; +http://www.timnash.co.uk/timbot)

As you can see, it pretty much mimics the GoogleBot string. TimBot has several variations: TimBot-Crawl (see below) and TimBot-Index. TimBot-Index is shown as:

Mozilla/5.0 (compatible; timbot-scrape/1.1; +http://www.timnash.co.uk/timbot)

or

Mozilla/5.0 (compatible; timbot-index/0.5; +http://www.timnash.co.uk/timbot)

Both are in fact the same program, and they index the site and its contents. Obviously I would prefer you let them index your site, but you can stop them by adding the following to your robots.txt file:

User-agent: timbot-index
Disallow: /


Please note that the above will stop both TimBot-Index agents, but Disallow: * will only stop the timbot-index agent string.
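If you want to check the rule before deploying it, here is a minimal sketch using Python's standard urllib.robotparser. It only shows how a standards-style parser reads the rule; it is not how TimBot itself parses robots.txt, which this page does not describe. The helper name is_allowed and the sample paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rule from this page, exactly as it would appear in robots.txt.
RULES = """\
User-agent: timbot-index
Disallow: /
"""


def is_allowed(robots_txt, agent_token, path="/"):
    """Return True if agent_token may fetch path under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, path)


# The index agent is blocked everywhere; an agent with no matching rule is not.
blocked = is_allowed(RULES, "timbot-index", "/any/page.html")
other = is_allowed(RULES, "SomeOtherBot", "/any/page.html")
```

Pass the short agent token (timbot-index) rather than the full Mozilla/5.0 string, because the standard library parser matches on the part before the first slash. The parser also knows nothing about timbot-scrape sharing the rule; that behaviour comes from the bot, as noted above.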

TimSpyBot

You really shouldn't be seeing SpyBot at all!
There are two scenarios in which you might come across SpyBot:

  • If you have followed an IP trail with no user agent, or a misleading one, and come across this page, it could well have been SpyBot.
  • The SpyBot user agent appeared in your logs (cloak testing).

SpyBot has many functions, but it's mainly used to mimic other browsers. It is also used for testing cloaking and other techniques. Don't worry if you were using a cloak and SpyBot visited: I'm not Google, and if your cloak was good it won't be a problem.
Cloak Testing
The easiest way to identify whether SpyBot has visited is to look for three HTTP requests within a few seconds of each other: one from GoogleBot, one from Slurp, and then one with the following agent string:

Mozilla/5.0 (compatible; timspy /0.2; +http://www.timnash.co.uk/timbot)

The earlier name for this bot was CrazyNinjaBot, and I'm afraid it blatantly ignores all REP directives.
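The three-request pattern described above can be searched for in your own logs. The sketch below is one rough way to do it, not SpyBot's actual behaviour spelled out in code. It assumes you have already parsed your access log into (timestamp, user agent) pairs sorted by time, and it treats "a few seconds" as a 10 second window; both the function name find_spybot_visits and that window are assumptions.

```python
from datetime import datetime, timedelta


def find_spybot_visits(entries, window_seconds=10):
    """Return timestamps of timspy requests preceded by GoogleBot and then Slurp.

    entries: list of (datetime, user_agent) tuples, sorted oldest first.
    window_seconds: assumed meaning of "a few seconds"; adjust to taste.
    """
    window = timedelta(seconds=window_seconds)
    hits = []
    for i, (ts, ua) in enumerate(entries):
        if "timspy" not in ua.lower():
            continue
        # Earlier requests that fall inside the time window, oldest first.
        recent = [u.lower() for t, u in entries[:i] if ts - t <= window]
        # Look for a GoogleBot request followed later by a Slurp request.
        google_at = next((k for k, u in enumerate(recent) if "googlebot" in u), None)
        if google_at is not None and any("slurp" in u for u in recent[google_at + 1:]):
            hits.append(ts)
    return hits


# Made-up sample data: one full pattern, then a lone timspy request a minute later.
start = datetime(2008, 1, 1, 12, 0, 0)
sample = [
    (start, "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    (start + timedelta(seconds=1), "Mozilla/5.0 (compatible; Yahoo! Slurp)"),
    (start + timedelta(seconds=2), "Mozilla/5.0 (compatible; timspy /0.2; +http://www.timnash.co.uk/timbot)"),
    (start + timedelta(seconds=60), "Mozilla/5.0 (compatible; timspy /0.2; +http://www.timnash.co.uk/timbot)"),
]
visits = find_spybot_visits(sample)
```

Since the GoogleBot and Slurp requests in this pattern are mimicked, it is probably worth also checking that all three requests came from the same IP address, which this sketch does not do.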

TimBot-Crawl

TimBot-Crawl, pronounced 'Tim-Buck-crawl', is my crawl agent. This bot simply records HTTP status codes: it doesn't index the page, it just records what status was returned. It's a very caring little bot and causes no problems, but it also ignores REP.
User Agent

Mozilla/5.0 (compatible; timbot-crawl/0.3; +http://www.timnash.co.uk/timbot)

This is probably the most common bot, as he is related to several ongoing projects and is often out checking to see if pages are alive and kicking.
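For anyone curious what "just recording the status" looks like, here is a minimal sketch of a status-only checker in Python. It is an illustration, not TimBot-Crawl's code: the HEAD request, the function names fetch_status and record_statuses, and the example user agent are all assumptions.

```python
import urllib.error
import urllib.request

# Hypothetical user agent for this sketch; it is not one of the TimBot strings.
USER_AGENT = "example-status-checker/0.1"


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url without keeping the response body."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": USER_AGENT}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses raise HTTPError; the code is still what we want.
        return error.code


def record_statuses(urls, fetch=fetch_status):
    """Map each URL to the status code returned by fetch."""
    return {url: fetch(url) for url in urls}
```

Note that urlopen follows redirects, so this sketch records the status of the final page rather than a 301 or 302 along the way, and a network failure (URLError) is left to propagate.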

Marvin aka TimBotMarvin

Marvin, our StumbleUpon Grump
Marvin is the grump of the family, designed and built to gather data on StumbleUpon users. You may have seen Marvin coming from your StumbleUpon profile or someone's favourites page. Marvin is probably best described as a meta crawler: its job is to identify what sort of page it is on and to try to build patterns of what topics users are interested in.
User Agent

Mozilla/5.0 (compatible; timbot-marvin/0.3; +http://www.timnash.co.uk/timbot)

or

Mozilla/5.0 (compatible; stumbleGrump/0.1; +http://www.timnash.co.uk/timbot)

Both agents will obey Disallow: * and will also obey:

User-agent: timbot-index
Disallow: /
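All of the user agent strings on this page share the same "compatible; name/version;" shape, so a log filter can pick them out with one pattern. The sketch below assumes that shape holds for every bot listed here; the names UA_PATTERN, KNOWN_BOTS and identify_bot are made up for illustration.

```python
import re

# Shape assumed from the strings on this page:
# "Mozilla/5.0 (compatible; <name>/<version>; +<url>)"
# The optional whitespace before the slash covers the "timspy /0.2" string.
UA_PATTERN = re.compile(
    r"\(compatible;\s*(?P<name>[A-Za-z-]+)\s*/(?P<version>[\d.]+);"
)

# Agent names taken from the strings listed on this page, lowercased.
KNOWN_BOTS = {
    "timbot",
    "timbot-scrape",
    "timbot-index",
    "timspy",
    "timbot-crawl",
    "timbot-marvin",
    "stumblegrump",
}


def identify_bot(user_agent):
    """Return (name, version) if user_agent is one of the bots above, else None."""
    match = UA_PATTERN.search(user_agent)
    if not match:
        return None
    name = match.group("name").lower()
    if name not in KNOWN_BOTS:
        return None
    return name, match.group("version")
```

Remember that a user agent string is only a claim: SpyBot in particular may show up with no agent or a misleading one, so this will only catch the bots when they announce themselves.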