Robots.txt Optimization

The purpose of this file is to tell search engine robots (also called crawlers or spiders) which parts of your website they are allowed to access. At this point many of you might wonder why you should put a robots.txt file in the root directory of your site at all: if you want spiders to crawl the whole site, isn't that simply their normal job? Wait, I have an answer for you. When a spider looks for a page that is not available on your website, the result is a 404 error; that is a known fact. This is where the robots.txt file comes into action: its name and location are well known to search engine spiders, and they request it first to check whether any barrier has been set for them on the site. If no robots.txt file has been created, that very request ends in a 404 error, which the spiders see and may report as a broken link.
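If all you want is to avoid that 404 while leaving the whole site open to crawlers, a minimal sketch of an allow-all file looks like this (an empty Disallow value means nothing is blocked):

User-agent: *
Disallow:

Dropping this two-line file in the root directory satisfies the spiders' first request without restricting anyone.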

1) Here's a basic "robots.txt" that blocks all crawlers from the whole site:

User-agent: *
Disallow: /

This is interesting: in example 1 we declare that crawlers in general should not crawl any part of our site. Now suppose we want one exception, Google, which should be allowed to crawl the entire site apart from /cgi-bin/ and /privatedir/; see example 2 below. Each crawler obeys only the most specific User-agent group that matches it, rather than merging all the groups, so the rules of specificity apply, not inheritance.
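2) A sketch of that file, assuming Google's standard crawler token Googlebot (the two directory names are taken from the description above):

User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /cgi-bin/
Disallow: /privatedir/

Googlebot matches its own group and ignores the wildcard group entirely, so it may fetch everything except the two listed directories, while every other crawler is shut out of the whole site.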