Heritrix
Owner of the robot : Internet Archive
Country : USA
Robot type : search engine software
Description : Heritrix is the Internet Archive's open-source, extensible web crawler project.
User Agent transmitted to the visited web server :
- Mozilla/5.0 (compatible; heritrix/1.0 +http://metacarta.com)
IP address range : not recorded
URL for more information : http://crawler.archive.org/
Access control options understood by the robot :
- robots.txt
- META NAME="robots"
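As a rough illustration of the robots.txt option listed above, the sketch below uses Python's standard urllib.robotparser to check whether a given set of rules would let a crawler fetch a URL. The token "heritrix", the example.com URLs, the rules themselves, and the helper name is_allowed are all assumptions made for the example; this record does not state which token the robot matches, and the crawler's own matching logic may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, not taken from any real site.
SAMPLE_ROBOTS_TXT = """\
User-agent: heritrix
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if robots_txt permits agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # "heritrix" is an assumed token; the record leaves that field blank.
    for path in ("/index.html", "/private/page.html"):
        url = "http://example.com" + path
        print(path, is_allowed(SAMPLE_ROBOTS_TXT, "heritrix", url))
```

With these sample rules, only paths under /private/ are closed to the assumed "heritrix" token, while every other robot falls through to the unrestricted "*" group.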
User Agent to use in the robots.txt file :
Last visit of this robot logged in September 2005.
Other information updated on October 19, 2005.