Pages Crawled by Googlebot
AWStats tracks Googlebot and many other web robots. It reports the number of hits coming from Googlebot, but not which pages Googlebot crawled or when.
The Extra Section below lists the URLs that Googlebot has visited, the number of visits to each, and the date and time of the last visit.
You can add the Extra Section below at the end of your awstats.your-domain-name.conf configuration file.
New Report
This AWStats screen has been simulated for better readability.
Extra Section
ExtraSectionName1="Pages crawled by Googlebot"
ExtraSectionCodeFilter1="200 304"
ExtraSectionCondition1="UA,^Mozilla\/5\.0 \(compatible\; Googlebot\/2\.1\; \+http\:\/\/www\.google\.com\/bot\.html\)$"
ExtraSectionFirstColumnTitle1="URL"
ExtraSectionFirstColumnValues1="URL,^(.*)$"
ExtraSectionFirstColumnFormat1="<A HREF='%s' TARGET='_blank'>%.80s</A>"
ExtraSectionStatTypes1=HL
ExtraSectionAddSumRow1=1
MaxNbOfExtra1=20
MinHitExtra1=1
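If you want to cross-check the report against the raw access log, the same tally (URL, hits, last visit) can be approximated outside AWStats. The Python sketch below is only an illustration, not part of AWStats: it assumes the log is in the Apache combined format, the sample log lines are invented, and the names LOG_LINE, GOOGLEBOT_UA and tally_crawled_pages are hypothetical. AWStats may preprocess log fields before it applies its conditions, so the numbers can differ from what this script counts.

```python
import re
from datetime import datetime

# Assumed log layout (Apache combined format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"$'
)

# Same user agent as ExtraSectionCondition1, written as a Python regex.
GOOGLEBOT_UA = re.compile(
    r"^Mozilla/5\.0 \(compatible; Googlebot/2\.1; "
    r"\+http://www\.google\.com/bot\.html\)$"
)
ACCEPTED_STATUS = {"200", "304"}  # mirrors ExtraSectionCodeFilter1


def tally_crawled_pages(lines):
    """Return {url: (hits, last_visit)} for Googlebot requests."""
    report = {}
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue  # line is not in the assumed format
        if m["status"] not in ACCEPTED_STATUS:
            continue
        if not GOOGLEBOT_UA.match(m["ua"]):
            continue
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        hits, last = report.get(m["url"], (0, when))
        report[m["url"]] = (hits + 1, max(last, when))
    return report


# Invented sample lines, for illustration only.
GBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'66.249.66.1 - - [08/Apr/2008:15:34:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "{GBOT}"',
    f'66.249.66.1 - - [08/Apr/2008:15:35:10 +0000] "GET /about.html HTTP/1.1" 304 0 "-" "{GBOT}"',
    f'66.249.66.1 - - [09/Apr/2008:02:10:05 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "{GBOT}"',
    '10.0.0.7 - - [09/Apr/2008:02:11:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux) Firefox/3.0"',
    f'66.249.66.1 - - [09/Apr/2008:02:12:00 +0000] "GET /missing.html HTTP/1.1" 404 310 "-" "{GBOT}"',
]

if __name__ == "__main__":
    for url, (hits, last) in sorted(tally_crawled_pages(SAMPLE_LOG).items()):
        print(url, hits, last.isoformat())
```

To run it on a real log, pass an open file object instead of SAMPLE_LOG. If the script finds Googlebot hits but the Extra Section stays empty, the condition pattern is the first thing to inspect.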
ExtraSectionCondition1 can be edited to detect hits from other web robots. For Yahoo! Slurp, use:
ExtraSectionCondition1="UA,^Mozilla\/5\.0 \(compatible\; Yahoo\! Slurp\; http\:\/\/help\.yahoo\.com\/help\/us\/ysearch\/slurp\)$"
MaxNbOfExtra1 is the maximum number of lines listed in the report. Raise it if you want to see more than 20 URLs.
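Because these conditions anchor the whole user agent string with ^ and $, a small change in a robot's version string is enough to make a pattern miss. Before editing the configuration, it may help to try a condition against user agent strings copied from your own log. The Python sketch below is a rough stand-in for that check, not AWStats code: matching_agents is a hypothetical helper, it only understands "UA,regex" parts joined by "||", and it assumes the Perl-style escapes used on this page behave the same in Python's re module. The two Slurp user agents are the variants implied by the patterns on this page.

```python
import re


def matching_agents(condition, user_agents):
    """Return the user agents accepted by a 'UA,regex||UA,regex' condition."""
    patterns = []
    for part in condition.split("||"):
        field, _, regex = part.partition(",")
        if field == "UA":  # other field types are ignored in this sketch
            patterns.append(re.compile(regex))
    return [ua for ua in user_agents if any(p.search(ua) for p in patterns)]


# The Yahoo! Slurp condition from this page.
SLURP = (
    r"UA,^Mozilla\/5\.0 \(compatible\; Yahoo\! Slurp\; "
    r"http\:\/\/help\.yahoo\.com\/help\/us\/ysearch\/slurp\)$"
)
# A second alternative for a versioned user agent, joined with "||".
SLURP_EXTENDED = SLURP + (
    r"||UA,^Mozilla\/5\.0 \(compatible\; Yahoo\! Slurp\/3\.0\; "
    r"http\:\/\/help\.yahoo\.com\/help\/us\/ysearch\/slurp\)$"
)

AGENTS = [
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
]

if __name__ == "__main__":
    print(len(matching_agents(SLURP, AGENTS)), "agent(s) match the single pattern")
    print(len(matching_agents(SLURP_EXTENDED, AGENTS)), "agent(s) match the extended pattern")
```

A user agent that appears in the log but is not returned here would not be counted by a condition written the same way, which is a quick hint that the pattern needs another "||" alternative.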
April 8th, 2008 at 3:34 pm
Hello,
Thanks for the info. How do I get my host (HostGator) to implement this code so I know which pages Googlebot crawled?
mark
November 11th, 2008 at 9:33 pm
Thanks, this is exactly what I was looking for! Too bad it’s not included by default.
BTW, how would you modify the Extra Section code to report Totals for both the Hits and the URLs columns? I’d like to know how many unique pages have been crawled without counting them…
April 14th, 2009 at 6:17 pm
Hello,
The Yahoo! Slurp code only reports hits on robots.txt. It reports all hits when changed to:
ExtraSectionCondition1="UA,^Mozilla\/5\.0 \(compatible\; Yahoo\! Slurp\; http\:\/\/help\.yahoo\.com\/help\/us\/ysearch\/slurp\)$||UA,^Mozilla\/5\.0 \(compatible\; Yahoo\! Slurp\/3\.0\; http\:\/\/help\.yahoo\.com\/help\/us\/ysearch\/slurp\)$"
Regards,
Frank
April 21st, 2009 at 11:00 pm
I am having trouble with the code above. The section does show up in my main awstats.MySite.html, but no page hits are shown. I searched through my log, which contains over 100 hits from Googlebot. What are my options for tracking down the error? Does AWStats provide any log files?
Regards,
Jacob
Question moved to our forum about Extra Sections: trouble with pages crawled by Googlebot.