Owner of the robot : Internet Archive

Country : USA

Robot type : search engine software

Description : Heritrix is the Internet Archive's open-source, extensible web crawler project.

User Agent transmitted to the visited web server :

  • Mozilla/5.0 (compatible; heritrix/1.0 +http://metacarta.com)
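
To spot this robot in server logs, one option is to match the product token in the User-Agent header. The following is a minimal Python sketch, assuming the crawler keeps a `heritrix/<version>` token followed by a `+<contact URL>`, as in the string listed above. The function name `parse_heritrix` and the pattern are illustrative and are not part of Heritrix.

```python
import re

# Assumption: the header contains "heritrix/<version>", optionally followed by
# "+<contact URL>", as in the string recorded in this entry.
HERITRIX_RE = re.compile(
    r"heritrix/(?P<version>[\w.\-]+)(?:\s+\+(?P<contact>[^\s)]+))?",
    re.IGNORECASE,
)

def parse_heritrix(user_agent):
    """Return (version, contact_url) if the header looks like Heritrix, else None."""
    match = HERITRIX_RE.search(user_agent)
    if match is None:
        return None
    return match.group("version"), match.group("contact")

ua = "Mozilla/5.0 (compatible; heritrix/1.0 +http://metacarta.com)"
print(parse_heritrix(ua))  # ('1.0', 'http://metacarta.com')
```

The contact URL varies with whoever operates the crawler, so matching on the `heritrix/` token alone is likely the more robust check.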


IP address range : from to ()

URL for more information : http://crawler.archive.org/

Access control options understood by the robot :

  • robots.txt
  • META NAME="robots"
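
A site owner who wants to check how a robots.txt rule set would apply to this robot could test it with Python's standard `urllib.robotparser`. This is a sketch only: the `heritrix` user-agent token, the sample rules, and the `example.com` URLs are assumptions (this record does not list a robots.txt token), and the result reflects the standard library's interpretation of the rules, not necessarily the robot's own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The "heritrix" token is an assumption, since this
# record leaves the robots.txt user agent blank.
ROBOTS_TXT = """\
User-agent: heritrix
Disallow: /private/

User-agent: *
Disallow:
"""

def is_allowed(robots_txt, agent, url):
    """Return True if the given rules permit `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

print(is_allowed(ROBOTS_TXT, "heritrix", "http://example.com/private/page.html"))
print(is_allowed(ROBOTS_TXT, "heritrix", "http://example.com/index.html"))
```

Per-page control is handled separately, through a robots META tag in the page's HTML head, which this sketch does not cover.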

User Agent to use in the robots.txt file :

Last visit of this robot logged in September 2005.
Other information updated on October 19, 2005.