What’s the Robot Exclusion Standard?

Because crawlers have the potential to wreak havoc on a web site, there have to be some guidelines to keep them in line. Those guidelines are called the Robot Exclusion Standard, the Robots Exclusion Protocol, or simply robots.txt.


The file robots.txt is the actual element you’ll work with. It’s a text-based document that should be placed in the root of your domain, and it essentially contains instructions for any crawler that comes to your site about what it is and is not allowed to index.
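For example, if your domain were www.example.com (a placeholder domain used here only for illustration), a crawler would request the file at:
http://www.example.com/robots.txt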
To communicate with the crawler, you need a specific syntax that it can understand. In its most
basic form, the text might look something like this:
User-agent: *
Disallow: /
These two parts of the text are essential. The first part, User-agent:, tells a crawler which user agent, or crawler, you’re commanding. The asterisk (*) indicates that all crawlers are covered, but you can specify a single crawler or even multiple crawlers, as the sketch below shows.
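For instance, a rule aimed only at Google’s crawler names its user agent, Googlebot, instead of using the asterisk (this is a sketch to show the syntax, not a recommended policy):
User-agent: Googlebot
Disallow: /
Because no other user agent is named, only Googlebot is told to stay out; every other crawler remains free to index the site. To command multiple crawlers, you give each one its own User-agent: line followed by the rules meant for it.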
The second part, Disallow:, tells the crawler what it is not allowed to access. The slash (/) indicates “all directories.” So the preceding code example essentially says, “all crawlers are to ignore all directories.”
When you’re writing robots.txt, remember to include the colon (:) after the User-agent indicator and after the Disallow indicator. The colon signals that the important information the crawler should act on follows.
You won’t usually want to tell all crawlers to ignore all directories. Instead, you might tell them to skip only your temporary directories by writing the text like this:
User-agent: *
Disallow: /tmp/
Or you can take it one step further and tell all crawlers to ignore multiple directories:
User-agent: *
Disallow: /tmp/
Disallow: /private/
Disallow: /links/listing.html
That piece of text tells the crawler to ignore the temporary and private directories, as well as the single page listing.html inside the links directory.
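Putting the pieces together, one file can carry different instructions for different crawlers. The following is a minimal sketch; SampleBot is a made-up user agent and the paths are only illustrative:
# Keep the (made-up) SampleBot out of the entire site
User-agent: SampleBot
Disallow: /
# Every other crawler skips only the temporary and private directories
User-agent: *
Disallow: /tmp/
Disallow: /private/
A crawler reading this file obeys the group that matches its own user agent, so SampleBot follows the first block while all other crawlers fall through to the asterisk group.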