Googlebot


Owner of the robot : Google Inc.

Country : USA

Robot type : search engine

Description : Google is the most popular search engine. Its founders invented the PageRank algorithm, which made search results far more useful and helped Google become the market leader.

Google uses several web robots to collect data from billions of web pages, and different Google services operate different robots. To limit unnecessary crawl activity, Google uses a caching proxy: when any Google robot reads a page, the contents of that page become available to all Google services. The main benefit for webmasters is a reduction in the bandwidth Google needs to crawl their sites. It also means that the search engine uses information collected by Google robots other than Googlebot.

Be careful when using robots.txt to disallow Googlebot! Blocking Googlebot with the following rules also blocks Googlebot-Image and Googlebot-Mobile:
User-agent: Googlebot
Disallow: /
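
The effect of these two lines can be checked locally before publishing them. The sketch below is only an illustration: it uses the robots.txt parser from Python's standard library, which is not Google's own parser, and the helper name blocked_agents and the crawler name ExampleBot are made up for this example. Python's parser matches the User-agent token as a substring of the crawler name, which is why a rule written for Googlebot also catches the longer names.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above, exactly as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="/index.html"):
    """Return the agents (hypothetical helper) that may not fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    # "ExampleBot" is an invented crawler name used as a control case.
    crawlers = ["Googlebot", "Googlebot-Image", "Googlebot-Mobile", "ExampleBot"]
    for name in blocked_agents(ROBOTS_TXT, crawlers):
        print("blocked:", name)
```

To keep one of the specialised robots crawling while blocking the main one, it would need its own User-agent group in robots.txt; the same helper can be used to test such a variant.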

    User Agent transmitted to the visited web server :

    • Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
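
For log analysis, the string above can be split into its parts. This is a minimal sketch, assuming the header keeps the "(compatible; Name/version; +URL)" shape shown here; the pattern and the function name parse_bot_user_agent are illustrative, not anything Google publishes. Since any client can send this header, a match is only a hint and not proof that the visitor is Googlebot.

```python
import re

# Assumed layout: "(compatible; <name>/<version>; +<info URL>)".
UA_PATTERN = re.compile(
    r"\(compatible;\s*(?P<name>[A-Za-z-]+)/(?P<version>[\d.]+);\s*\+(?P<url>[^)\s]+)\)"
)


def parse_bot_user_agent(user_agent):
    """Return a dict with name, version and url, or None if the shape differs."""
    match = UA_PATTERN.search(user_agent)
    return match.groupdict() if match else None


if __name__ == "__main__":
    ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(parse_bot_user_agent(ua))
```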

    IP address range :

    • from 66.249.64.0 to 66.249.95.255 (googlebot.com)
      (last visit in April 2008)
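
A visitor's address can be tested against the range listed above. The sketch below assumes the range is still the one observed in April 2008, which may no longer be complete or current; the function name in_listed_range is chosen for this example. It uses Python's ipaddress module to turn the first and last address into CIDR form and then checks membership.

```python
import ipaddress

# First and last address of the range listed on this page (observed in 2008).
FIRST = ipaddress.IPv4Address("66.249.64.0")
LAST = ipaddress.IPv4Address("66.249.95.255")

# summarize_address_range collapses a first/last pair into CIDR networks.
LISTED_NETWORKS = list(ipaddress.summarize_address_range(FIRST, LAST))


def in_listed_range(ip):
    """Return True if `ip` falls inside the range listed on this page."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in LISTED_NETWORKS)


if __name__ == "__main__":
    print(LISTED_NETWORKS)
    for candidate in ("66.249.66.1", "66.249.96.0"):
        print(candidate, in_listed_range(candidate))
```

The range works out to a single /19 block, so a firewall or log filter can express it as 66.249.64.0/19.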

    Access control options understood by the robot :

    • robots.txt
    • META NAME="robots"
    • META NAME="Googlebot"
    • rel="nofollow"

    User Agent to use in the robots.txt file : Googlebot
