Internet Archive
Owner of the robot : The Internet Archive
Country : USA
Robot type : search engine
Description : The Internet Archive was founded in 1996 as a non-profit organization. It builds an Internet library of digital collections, for researchers, historians, and scholars. Many old versions of web pages are publicly available from their web site.
ia_archiver-web.archive.org is a web robot directly managed by the Internet Archive. This robot complements the data acquisitions done by ia_archiver, the Alexa robot. Alexa seems to be the main provider of information to the Internet Archive.
The “robots” META tag is not supported. Pages including the line below can still be retrieved and added to the index of the Internet Archive :
<meta name="robots" content="noindex">
User Agent transmitted to the visited web server :
- Mozilla/5.0 (compatible; heritrix/2.0.0-SNAPSHOT-20071129.030306 +http://crawler.archive.org)
IP address range :
- from 208.70.24.0 to 208.70.31.255 (archive.org)
(last visit in November 2007)
User Agent transmitted to the visited web server :
- ia_archiver-web.archive.org
IP address range :
- from 208.70.24.0 to 208.70.31.255 (archive.org)
(last visit in March 2008)
- from 207.241.224.0 to 207.241.239.255 (archive.org)
(last visit in May 2008)
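To recognise visits from these ranges in a server log, the first and last addresses listed above can be converted to networks and tested for membership. This is a minimal sketch using Python's standard ipaddress module. The names ARCHIVE_RANGES, archive_networks and is_archive_address are hypothetical, and the ranges are only those observed on the dates given above, so they may be out of date.

```python
import ipaddress

# Ranges copied from the listing above as (first address, last address).
# These reflect visits observed in 2007-2008 and may no longer be current.
ARCHIVE_RANGES = [
    ("208.70.24.0", "208.70.31.255"),
    ("207.241.224.0", "207.241.239.255"),
]


def archive_networks():
    """Convert each first/last pair into one or more CIDR networks."""
    networks = []
    for first, last in ARCHIVE_RANGES:
        networks.extend(
            ipaddress.summarize_address_range(
                ipaddress.ip_address(first), ipaddress.ip_address(last)
            )
        )
    return networks


def is_archive_address(ip):
    """Return True if ip falls inside one of the listed ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in archive_networks())
```

For instance, is_archive_address("208.70.27.1") is true, while an address just outside the second range, such as 207.241.240.1, is not matched.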
Access control options understood by the robot :
- robots.txt
User Agent to use in the robots.txt file : ia_archiver
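A robots.txt rule addressed to ia_archiver can be checked locally before it is published. The sketch below uses Python's standard urllib.robotparser. The rules in ROBOTS_TXT, the /private/ path and the example.com URLs are hypothetical, and the result shows how Python's parser reads the file, which is only an approximation of how the robot itself interprets it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep ia_archiver out of /private/,
# leave every other robot unrestricted.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /private/

User-agent: *
Disallow:
"""


def can_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, can_fetch("ia_archiver", "http://www.example.com/private/page.html") is false, while the same robot may still fetch http://www.example.com/index.html.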
URLs for more information :
Our pages about other ia_archiver robots :