Pages Crawled by Googlebot


 

AWStats tracks Googlebot and many other web robots. It reports the number of hits coming from Googlebot, but not which pages Googlebot crawled or when it crawled them.

The Extra Section below lists the URLs that Googlebot has visited, the number of times each one was visited, and the date and time of the last visit.

You can add the Extra Section below at the end of your awstats.your-domain-name.conf configuration file.

New Report

[Figure: AWStats, pages crawled by Googlebot. This AWStats screen has been simulated for better readability.]

Extra Section

ExtraSectionName1="Pages crawled by Googlebot"
ExtraSectionCodeFilter1="200 304"
ExtraSectionCondition1="UA,^Mozilla\/5\.0 \(compatible\; Googlebot\/2\.1\; \+http\:\/\/www\.google\.com\/bot\.html\)$"
ExtraSectionFirstColumnTitle1="URL"
ExtraSectionFirstColumnValues1="URL,^(.*)$"
ExtraSectionFirstColumnFormat1="<A HREF='%s' TARGET='_blank'>%.80s</A>"
ExtraSectionStatTypes1=HL
ExtraSectionAddSumRow1=1
MaxNbOfExtra1=20
MinHitExtra1=1
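
The condition above matches only one exact user-agent string. Googlebot also sends longer user-agent strings that contain the same "compatible; Googlebot/2.1" token, for example from its smartphone crawler, and the anchored pattern would not match those. A looser variant is sketched below; it is an untested assumption rather than part of the original section, and it relies on the condition being an ordinary regular expression that may match anywhere in the user-agent field.

```
# Assumption: any user agent containing "Googlebot/" is counted as Googlebot.
# Note that this also counts clients that merely claim to be Googlebot.
ExtraSectionCondition1="UA,Googlebot\/"
```

Use this line in place of the stricter ExtraSectionCondition1, not in addition to it.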

To report on another web robot, edit ExtraSectionCondition1 so that it matches that robot's user-agent string. For Yahoo! Slurp, use:

ExtraSectionCondition1="UA,^Mozilla\/5\.0 \(compatible\; Yahoo\! Slurp\; http\:\/\/help\.yahoo\.com\/help\/us\/ysearch\/slurp\)$"

MaxNbOfExtra1 sets the maximum number of lines listed in the report. Choose whatever value suits you.
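
To report on Yahoo! Slurp alongside Googlebot rather than instead of it, the same directives can be repeated with the suffix 2, since AWStats numbers its Extra Sections. The block below is a sketch that reuses the values from the Googlebot section; the section name is only a suggested label.

```
# Second Extra Section (suffix 2), kept separate from the Googlebot section
ExtraSectionName2="Pages crawled by Yahoo! Slurp"
ExtraSectionCodeFilter2="200 304"
ExtraSectionCondition2="UA,^Mozilla\/5\.0 \(compatible\; Yahoo\! Slurp\; http\:\/\/help\.yahoo\.com\/help\/us\/ysearch\/slurp\)$"
ExtraSectionFirstColumnTitle2="URL"
ExtraSectionFirstColumnValues2="URL,^(.*)$"
ExtraSectionFirstColumnFormat2="<A HREF='%s' TARGET='_blank'>%.80s</A>"
ExtraSectionStatTypes2=HL
ExtraSectionAddSumRow2=1
MaxNbOfExtra2=20
MinHitExtra2=1
```

Each numbered section produces its own report, so MaxNbOfExtra2 and MinHitExtra2 can differ from the values used for Googlebot.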

One Response to “Pages Crawled by Googlebot”

  1. mark says:

    Hello,

    thanks for the info. How do I get my host (hostgator) to implement this code so I know what pages googlebot crawled?

    mark


 
