Posted by: SEO Positive

Posted on: July 13, 2011 12:36 pm


If your website doesn’t currently have a robots.txt file, you have no say over which of your pages the search engines crawl, which can affect how well your site is indexed and ranked. This tutorial will teach you how to set up the file.

The robots.txt file is especially important because it tells the spiders from the major search engines which pages to crawl and which pages to ignore. The file itself needs to be placed in your root directory as a plain text file (no HTML needed). A typical robots.txt file may look like the following example:

User-agent: *

Disallow: /

By inserting a *, you are telling the search engine bots that this rule applies to all of them. You can also insert the name of a specific crawler here (Googlebot, for example), which would make the rules apply to that bot alone.
The / indicates that the bots are not to crawl any of the pages on the site (this isn’t desirable if you’re working hard on your SEO, of course).
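If you’d like to double-check how a crawler reads these rules, Python’s standard-library urllib.robotparser will parse them for you (a quick sketch; example.com is just a placeholder domain):

```python
from urllib.robotparser import RobotFileParser

# The "block everything" rules from the example above.
rules = ["User-agent: *", "Disallow: /"]

parser = RobotFileParser()
parser.parse(rules)

# With Disallow: /, no URL on the site may be fetched by any crawler.
print(parser.can_fetch("*", "http://example.com/"))      # False
print(parser.can_fetch("*", "http://example.com/page"))  # False
```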

The original robots.txt standard doesn’t define an ‘Allow’ directive (though some crawlers, such as Googlebot, support one as an extension), so the following file will let all of the search engines crawl every page on your site:

User-agent: *

Disallow:

Leaving the Disallow value blank indicates that there are no crawling restrictions.
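A quick check with urllib.robotparser (example.com is a placeholder) confirms that a blank Disallow leaves everything crawlable:

```python
from urllib.robotparser import RobotFileParser

# A blank Disallow value means no restrictions at all.
parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow:"])

print(parser.can_fetch("*", "http://example.com/any-page"))  # True
```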

Bear in mind that the rules should always appear in the order ‘User-agent’ first, then ‘Disallow’, as a Disallow line only applies to the User-agent line(s) that precede it.

You can also stop the search engine spiders from crawling specific pages or directories within your website, if you’d rather that some content wasn’t made available to the masses. A good example would be the following:

User-agent: *
Disallow: /example-page/
Disallow: /sub-example/
Disallow: /second-sub-example/
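To verify which URLs these rules actually block, you can again feed them to urllib.robotparser (a sketch; example.com and /other-page/ are placeholders):

```python
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Disallow: /example-page/",
    "Disallow: /sub-example/",
    "Disallow: /second-sub-example/",
]
parser = RobotFileParser()
parser.parse(rules)

# Listed paths are blocked; everything else stays crawlable.
print(parser.can_fetch("*", "http://example.com/example-page/"))  # False
print(parser.can_fetch("*", "http://example.com/other-page/"))    # True
```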

You can also disallow deeper subpages. Contrary to a common misconception, a long path like the following is perfectly valid; it blocks only URLs that begin with that full path, while the parent directories /example-page/ and /example-page/sub-example/ remain crawlable:

User-agent: *
Disallow: /example-page/sub-example/second-subexample/
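In fact, feeding that long path to urllib.robotparser shows it is handled fine: only URLs under the full disallowed prefix are blocked, while the shallower directories stay open (example.com is a placeholder):

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /example-page/sub-example/second-subexample/",
])

# Only URLs under the full disallowed prefix are blocked...
print(parser.can_fetch(
    "*", "http://example.com/example-page/sub-example/second-subexample/a.html"))  # False
# ...the parent directories remain open to crawlers.
print(parser.can_fetch("*", "http://example.com/example-page/"))  # True
```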