Ok, so very quickly: Google announced via the Webmaster Blog that the ever mighty and slightly naughty Googlebot has been given new directives to boldly go and fill in forms. It is pretty restrictive and currently purely an experiment…
In the past few months we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn’t find and index for users who search on Google. Specifically, when we encounter a `<FORM>` element on a high-quality site…
You can read the whole article over on their blog. While I have yet to test this, there are a few obvious things that come to mind…
- What data does it use to fill the form in? Now they partially answer…
For text boxes, our computers automatically choose words from the site that has the form
So that’s really clear!
- By using GET one presumes it is simply generating URL strings and processing them. Given this is liable to lead to error messages or similar content, how will it cope with duplicate content? (See the sketch after this list.)
- What is personal data?
- What is a high-quality site?
- Under what circumstances would you want Google crawling form results?
- Won’t this increase the chance of weird queries à la a few days ago?
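To illustrate the duplicate-content worry: a GET form is really just a recipe for building URLs, so every combination of values Googlebot tries becomes its own crawlable URL, and many of those will render near-identical results or error pages. Here is a rough sketch of that explosion, using a hypothetical `/search` form with made-up `q` and `category` fields:

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical GET form: action="/search" with a text box and a select menu.
base = "http://example.com/search"
candidate_values = {
    "q": ["widgets", "blue widgets", "contact"],  # words scraped from the page
    "category": ["all", "news"],
}

# Each combination of field values becomes a distinct crawlable URL...
for combo in product(*candidate_values.values()):
    params = dict(zip(candidate_values.keys(), combo))
    print(f"{base}?{urlencode(params)}")

# ...and many of these will serve the same "no results found" page,
# which is exactly where the duplicate-content question comes in.
```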
Ultimately, Google attempting to crawl more of the web is a good thing, right? So why do I feel uneasy? If you are worried about Google gobbling bandwidth or harassing your sales team, you could either fix your form or block Googlebot in your preferred normal way. Which is fine, except what happens when the form is on the front page? Will we have nofollows on submit buttons?
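For what it’s worth, the “preferred normal way” for most people will be a robots.txt rule on the form’s action URL. A quick sketch of checking that with Python’s urllib.robotparser (the `/search` path and example.com are just placeholders, not anything from Google’s post):

```python
import urllib.robotparser

# Hypothetical robots.txt keeping Googlebot away from a form's GET results.
robots_txt = """\
User-agent: Googlebot
Disallow: /search
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(robots_txt.splitlines())

# The generated form-result URLs should be off limits...
print(parser.can_fetch("Googlebot", "http://example.com/search?q=widgets"))  # False
# ...while the page that hosts the form stays crawlable.
print(parser.can_fetch("Googlebot", "http://example.com/index.html"))        # True
```

That does nothing for a form sitting on the front page itself, of course, which is the part that still bugs me.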
Have you come across Google’s new crawling experiment? If so, what did it do? Has Google left a comment on your blog because you renamed the fields in your comment form?