Evolution Search Marketing: find out how to make the most of your robots.txt file or robots meta tag, and how to create them. This is a brief overview of robots.txt files; this and other SEO tips are available on our site, along with expert consultancy services.
 

How to create and use the robots.txt file & robots meta tags -

Robots Text Configuration -

 

The robots.txt file is a means of keeping search engines out of a site or out of specific sections of a site. This is useful if there are sections of the site you would rather not have indexed (though password protection is much more reliable, since robots.txt is only a request that well-behaved crawlers choose to honour).

 

Often, while a site is being redesigned, developers will add a robots.txt file to keep engines out during development. The problem is that this file is sometimes forgotten and not removed when the site goes live. If a site is not getting indexed, check for a robots.txt file and make sure the engines are being allowed in. A site without a robots.txt file is indexable by default once engines discover it.
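As a rough sketch of that check, the snippet below uses Python's standard urllib.robotparser module to test whether a given robots.txt text shuts crawlers out of the site root. The function name blocks_root and the sample text are assumptions made for illustration only; checking "/" is just a quick proxy for a forgotten site-wide block.

```python
from urllib.robotparser import RobotFileParser


def blocks_root(robots_txt, agent="*"):
    """Return True if this robots.txt text forbids `agent` from fetching "/"."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines (no network access).
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, "/")


# A block that was left over from development (hypothetical content).
leftover = "User-agent: *\nDisallow: /\n"

print(blocks_root(leftover))  # True: everything is blocked
print(blocks_root(""))        # False: no rules, so crawling is allowed
```

If the first case applies to a live site, removing or relaxing the Disallow line is the likely fix.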

 

Using robots tags on individual pages -

 

A variation on the robots.txt file is the meta robots tag, which sets the rules for each page individually.
A basic meta robots tag is placed within the <head> part of the page's source code and would read:

<meta name="Robots" content="NOINDEX, NOFOLLOW" />

 

In this instance it tells all search engines not to index the content of this page, and not to follow the links on it.
If there are particular pages you do NOT want indexed by the engines, it is safest to put the meta robots tag on the page itself as well. However, when the robots.txt file on the server and the meta robots tag clash, the robots.txt file wins: a crawler that is disallowed from a page never fetches it, so it never sees the tag on that page.
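To see which directives a page is actually sending, something like the following could be used. It is only a sketch built on Python's standard html.parser module; the class name MetaRobotsParser, the helper meta_robots_directives and the sample page are illustrative assumptions, not part of any SEO tool.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # The name and the directives are compared case-insensitively.
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def meta_robots_directives(html):
    """Return the set of lower-cased robots directives found in `html`."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="Robots" content="NOINDEX ,NOFOLLOW" /></head></html>'
print(sorted(meta_robots_directives(page)))  # ['nofollow', 'noindex']
```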

 

Using sitewide robots txt files -

 

Like .htaccess files, a robots.txt file is plain text, so a simple text editor such as Notepad is usually the best way to create it.

 

For help writing a specific robots.txt file, you can also view Microsoft's guidelines at http://support.microsoft.com/default.aspx?scid=KB;en-us;q217103

Once the file is created, upload it to the root of your server (usually the same folder as your homepage).
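Because crawlers look for the file at the root of the host, its address can be worked out from any page address on the site. A minimal sketch using Python's standard urllib.parse module follows; the function name robots_url and the example.com address are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the address where crawlers expect the robots.txt for `page_url`."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://www.example.com/products/item.html?id=3"))
# https://www.example.com/robots.txt
```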

 

For other site examples -

 

Many sites currently expose theirs at nameofsite.com/robots.txt. Even Google does, at google.com/robots.txt, which lists a lot of folders that they don't want their own (or anyone else's) engine to index.

 

The first line reads "User-agent: *", which addresses all search engine spiders (using * as the user agent means "all"). The Allow and Disallow lines that follow tell those spiders which pages within the listed folders they may or may not visit, regardless of what the pages themselves have in their own code.
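How the User-agent line selects which rules apply can be illustrated with Python's standard urllib.robotparser module. In this sketch the crawler names ExampleBot and OtherBot, the rules and the helper can_crawl are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one named crawler is shut out, everyone else is allowed.
RULES = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow:
"""


def can_crawl(robots_txt, agent, path):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


print(can_crawl(RULES, "ExampleBot", "/page.html"))  # False: matched by name
print(can_crawl(RULES, "OtherBot", "/page.html"))    # True: falls back to *
```

An empty Disallow value, as in the second group, means nothing is disallowed for that group.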

 

The safest way to block content is to set up the robots.txt file before the page goes live (once a page gets indexed, it may linger in the engines' data centres even after you block it). Ideally, add one line for each page or folder that you want a rule to apply to.

 

Example:

User-agent: *
Disallow: /iwanttoblockthispage.html
Disallow: /privatefolder/
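Rules like these can be tried out before uploading. The sketch below feeds the example above to Python's standard urllib.robotparser module; the helper name is_allowed and the test paths are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /iwanttoblockthispage.html
Disallow: /privatefolder/
"""


def is_allowed(robots_txt, path, agent="*"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


print(is_allowed(RULES, "/iwanttoblockthispage.html"))  # False
print(is_allowed(RULES, "/privatefolder/report.html"))  # False
print(is_allowed(RULES, "/index.html"))                 # True
```

Each Disallow value is matched as a prefix of the path, which is why everything under /privatefolder/ is covered by a single line.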

 

However, be careful not to put -

User-agent: *
Disallow: /

 

The snippet above tells all robots not to visit, and so not to index, any part of the site.

 

Lastly, you can set the robots.txt file to block certain engines, or even certain file types. For instance, you may not want the engines to index any of your site's PowerPoint presentations:

 

User-agent: *
Disallow: /*.ppt$ # disallow access to PowerPoint presentations

 

The same works for all other common formats too, in engines that support the * and $ wildcards (Google and Bing both do).
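To make the wildcard behaviour concrete, here is a small sketch of how such a pattern could be matched against a path: * stands for any run of characters and a trailing $ pins the pattern to the end of the path. It is an illustrative matcher written for this page, not the code any search engine uses, and the names rule_to_regex and rule_matches are assumptions. Python's own urllib.robotparser does not appear to expand these wildcards, hence the hand-rolled version.

```python
import re


def rule_to_regex(rule_path):
    """Translate a robots.txt path pattern using * and $ into a compiled regex."""
    anchored = rule_path.endswith("$")
    body = rule_path[:-1] if anchored else rule_path
    # Escape each literal piece, then join the pieces with "any characters".
    pattern = ".*".join(re.escape(piece) for piece in body.split("*"))
    if anchored:
        pattern += r"\Z"  # the path must end exactly here
    return re.compile(pattern)


def rule_matches(rule_path, url_path):
    """Return True if `url_path` is covered by the pattern `rule_path`."""
    # match() anchors at the start, mirroring robots.txt prefix matching.
    return rule_to_regex(rule_path).match(url_path) is not None


print(rule_matches("/*.ppt$", "/slides/deck.ppt"))   # True
print(rule_matches("/*.ppt$", "/slides/deck.pptx"))  # False
print(rule_matches("/*.pdf$", "/docs/guide.pdf"))    # True
```

Without the trailing $, the pattern /*.ppt would also cover /slides/deck.pptx, since matching then only has to succeed as a prefix.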

 

Find out about this and other ways to make your site index well on search engines through the services we provide at Evolution Search Marketing. Contact us with any questions you may have.

 
