Optimize your robots.txt for Magento

Creating and optimizing a robots.txt file is an important step in securing your Magento store and improving its SEO.

The robots.txt ("robots dot text") is a text file that help Search engine robots (such as Google bot and Bing bot) to determine which information to index. By default there is no robots.txt in Magento Community or Enterprise distributive so you should create it yourself.

How will robots.txt improve your Magento store?

Here are just a few use cases for robots.txt, to give you a better idea of why it is so important:

  • robots.txt helps you prevent duplicate content issues, which is very important for SEO (see the example after this list).
  • It hides technical information such as error logs, reports, core files and .svn directories from unwanted indexing, so hackers cannot use search engines to detect your platform and gather other sensitive details.
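For example, Magento serves the same category page under several URLs as soon as visitors sort or filter it (the shoes.html path here is illustrative):

http://www.example.com/shoes.html
http://www.example.com/shoes.html?dir=asc&order=price
http://www.example.com/shoes.html?mode=list

Search engines can treat these as three separate pages with identical content. Rules such as Disallow: /*?dir* and Disallow: /*?mode* in the code below keep crawlers away from the sorted and filtered variants.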

Robots.txt installation

Note: A robots.txt file covers exactly one domain. For Magento websites with multiple domains or sub-domains, each domain/sub-domain (e.g. store.example.com and example.com) must have its own robots.txt file.
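For example, a setup with a main store and a sub-domain store needs two separate files, one at the root of each host:

http://www.example.com/robots.txt
http://store.example.com/robots.txt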

Magento Community and Magento Enterprise

Installing robots.txt is easy. All you need to do is create a robots.txt file and copy the robots.txt code from our blog into it. Next, upload the robots.txt file to the web root of your server, so that it is reachable at example.com/robots.txt.

If you upload robots.txt to a sub-folder instead, e.g. example.com/store/robots.txt, it will be ignored by all search engines.
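A quick way to confirm the file is served from the web root is to request it over HTTP. Here is a minimal sketch in Python (the example.com URL is a placeholder for your own store domain):

import urllib.request

# Placeholder URL: replace example.com with your own store's domain.
url = "http://www.example.com/robots.txt"

with urllib.request.urlopen(url) as response:
    # A 200 status code means crawlers can fetch the file from the web root.
    print(response.getcode())
    # Print the beginning of the file to confirm the right version was uploaded.
    print(response.read().decode("utf-8")[:200])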

Magento Go

Installation of robots.txt for Magento Go is described in this Knowledge Base article.

Robots.txt for Magento

Here is our recommended robots.txt code. Please read the comments marked with # before publishing the file:

## robots.txt for Magento Community and Enterprise

## GENERAL SETTINGS

## Enable robots.txt rules for all crawlers
User-agent: *

## Crawl-delay parameter: number of seconds to wait between successive requests to the same server.
## Uncomment and set a custom crawl rate if you're experiencing traffic problems with your server.
## Note that Googlebot ignores Crawl-delay; its crawl rate is set in Google Webmaster Tools instead.
# Crawl-delay: 30

## Magento sitemap: uncomment and replace the URL to your Magento sitemap file
# Sitemap: http://www.example.com/sitemap/sitemap.xml

## DEVELOPMENT RELATED SETTINGS

## Do not crawl development files and folders: CVS, svn directories and dump files
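## Note: in the patterns below, * matches any sequence of characters and a
## trailing $ anchors the pattern to the end of the URL. These wildcards are
## understood by the major crawlers (e.g. Googlebot, Bingbot) but are not part
## of the original robots.txt standard, so smaller crawlers may ignore them.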
Disallow: /CVS
Disallow: /*.svn$
Disallow: /*.idea$
Disallow: /*.sql$
Disallow: /*.tgz$

## GENERAL MAGENTO SETTINGS

## Do not crawl the Magento admin page (adjust the path if you use a custom admin route)
Disallow: /admin/

## Do not crawl common Magento technical folders
Disallow: /app/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /lib/
Disallow: /pkginfo/
Disallow: /shell/
Disallow: /var/

## Do not crawl common Magento files
Disallow: /api.php
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /get.php
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /README.txt
Disallow: /RELEASE_NOTES.txt

## MAGENTO SEO IMPROVEMENTS

## Do not crawl sub-category pages that are sorted or filtered.
Disallow: /*?dir*
Disallow: /*?dir=desc
Disallow: /*?dir=asc
Disallow: /*?limit=all
Disallow: /*?mode*

## Do not crawl the duplicate copy of the home page (example.com/index.php/). Uncomment it only if you have activated Magento SEO-friendly URLs.
# Disallow: /index.php/

## Do not crawl links with session IDs
Disallow: /*?SID=

## Do not crawl checkout and user account pages
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/

## Do not crawl search pages and non-SEO-optimized catalog links
Disallow: /catalogsearch/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/

## SERVER SETTINGS

## Do not crawl common server technical folders and files
Disallow: /cgi-bin/
Disallow: /cleanup.php
Disallow: /apc.php
Disallow: /memcache.php
Disallow: /phpinfo.php

## IMAGE CRAWLERS SETTINGS

## Extra: Uncomment if you do not wish Google and Bing to index your images
# User-agent: Googlebot-Image
# Disallow: /
# User-agent: msnbot-media
# Disallow: /

Test your robots.txt

After publishing your robots.txt, you can check its syntax with an online validator such as the robots.txt analysis tool in Google Webmaster Tools or Bing Webmaster Tools.
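You can also sanity-check the rules locally with Python's built-in robotparser before uploading. Note that the standard library parser follows the original robots.txt specification and only does prefix matching, so it will not evaluate the * and $ wildcard patterns the way Googlebot does. A minimal sketch:

from urllib.robotparser import RobotFileParser

# Paste (a subset of) your robots.txt rules here to test them locally.
rules = """
User-agent: *
Disallow: /checkout/
Disallow: /customer/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Paths matching a Disallow rule should come back False.
print(parser.can_fetch("*", "/checkout/onepage/"))   # False
print(parser.can_fetch("*", "/customer/account/"))   # False
print(parser.can_fetch("*", "/some-product.html"))   # True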

Further reading