I have an interesting log file to analyze. I have many web sites served from Windows Server 2003 with IIS 6. These web sites are built with a specific management system, and they are all listed under the same web site in IIS. So they are all logged in one log file, and the only place the host names show up is the cs(Referer) field. Here are some lines from that file:
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-bytes time-taken
2007-05-13 00:00:05 xxx.xxx.xxx.xxx GET /img/abc/logo.gif - - xxx.xxx.xxx.xxx Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1) http://www.abc.ddd.eee.ff/ 200 501 93
2007-05-13 00:00:34 xxx.xxx.xxx.xxx GET /img/xyz/logo.gif - - xxx.xxx.xxx.xxx Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1) http://www.xyz.ddd.eee.ff/ 200 3662 124
2007-05-13 00:00:52 xxx.xxx.xxx.xxx GET /img/prs/logo.gif - - xxx.xxx.xxx.xxx Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+tr;+rv:1.8.1.3)+Gecko/20070309+Firefox/2.0.0.3 http://www.prs.ddd.eee.ff/ 200 1042 15
As you can see, these sites are all of the form www.xxx.ddd.eee.ff/ and I need separate statistics for each of them. Is there a way to do this? I think this log file must be analyzed by the cs(Referer) field, but I don't know how to do this. Any ideas?
The referrer field is not supposed to identify the web server. It records the page the visitor came from, so visitors who type the address of the site in the address bar, who hide their referrers, who bookmarked the site, or who clicked a link on another web site will not show the referrer you expect.
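To make the point concrete, here is a minimal Python sketch of what grouping by referrer would have to work with. The function name `referer_host` and the `search.example.com` value are hypothetical and only there for illustration; the first value is taken from the sample lines in the question.

```python
from urllib.parse import urlsplit

def referer_host(referer):
    """Return the host part of a cs(Referer) value, or None when it is absent."""
    # IIS writes "-" when the browser sent no Referer header.
    if not referer or referer == "-":
        return None
    return urlsplit(referer).hostname

# Hypothetical referrer values: one internal page view, one direct visit,
# and one visit arriving from another web site.
referers = [
    "http://www.abc.ddd.eee.ff/",
    "-",
    "http://search.example.com/?q=abc",
]

for value in referers:
    print(value, "->", referer_host(value))
```

Only the first request would be counted for www.abc.ddd.eee.ff; the direct visit has no host at all and the third would be attributed to the other web site.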
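You should add a field for the virtual host name to the log file. Then it will be easy. In IIS, it is called r-host or cs-host (the IIS 6 W3C Extended logging properties list it as Host (cs-host)). Note that this only helps for requests logged after the field is enabled.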
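Once the host field is in the log, separating the sites is a matter of reading the `#Fields:` line and grouping on that column. The following is a rough Python sketch rather than a full analyzer; it assumes the field is logged as `cs-host`, and the function name `hits_per_host` and the shortened sample lines are made up for the example.

```python
from collections import Counter

def hits_per_host(lines, host_field="cs-host"):
    """Count requests per virtual host in a W3C extended log."""
    fields = []
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The field list can change mid-file, so re-read it every time.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other directives such as #Software or #Date
        values = line.split()
        if host_field not in fields or len(values) != len(fields):
            continue  # no host column, or a malformed line
        record = dict(zip(fields, values))
        counts[record[host_field].lower()] += 1
    return counts

# Hypothetical log excerpt with fewer fields than a real IIS log.
sample = [
    "#Fields: date time cs-method cs-uri-stem cs-host sc-status",
    "2007-05-13 00:00:05 GET /img/abc/logo.gif www.abc.ddd.eee.ff 200",
    "2007-05-13 00:00:34 GET /img/xyz/logo.gif www.xyz.ddd.eee.ff 200",
    "2007-05-13 00:00:52 GET /img/abc/logo.gif www.abc.ddd.eee.ff 200",
]

print(hits_per_host(sample))
```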
If you need to parse an existing log file that lacks this information, you could use the SkipFiles parameter and list the paths of all the sites you do not want (assuming /abc/ is a folder reflecting the site name). Repeat the analysis once per site, changing the parameter each time so that the other sites are excluded.
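An alternative to repeating the run with different exclusions is to split the existing log by that folder first. This is a minimal Python sketch under the same assumption as above, namely that the second path segment of cs-uri-stem (the `abc` in `/img/abc/logo.gif`) reflects the site. The function name `split_by_site_folder` and the `folder_index` parameter are hypothetical; requests whose path has no such folder are simply dropped here.

```python
from collections import defaultdict

def split_by_site_folder(lines, folder_index=1):
    """Group log lines by a folder in cs-uri-stem (/img/abc/... -> 'abc')."""
    fields = []
    headers = []
    per_site = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            headers.append(line)  # keep directives so each output stays parseable
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        if "cs-uri-stem" not in fields or len(values) != len(fields):
            continue  # unknown layout or malformed line
        stem = values[fields.index("cs-uri-stem")]
        parts = [p for p in stem.split("/") if p]
        if len(parts) <= folder_index:
            continue  # path too short to carry a site folder
        per_site[parts[folder_index].lower()].append(line)
    return headers, dict(per_site)

# Two of the sample lines from the question, with the same field layout.
sample = [
    "#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query cs-username "
    "c-ip cs(User-Agent) cs(Referer) sc-status sc-bytes time-taken",
    "2007-05-13 00:00:05 xxx.xxx.xxx.xxx GET /img/abc/logo.gif - - xxx.xxx.xxx.xxx "
    "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1) "
    "http://www.abc.ddd.eee.ff/ 200 501 93",
    "2007-05-13 00:00:34 xxx.xxx.xxx.xxx GET /img/xyz/logo.gif - - xxx.xxx.xxx.xxx "
    "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1) "
    "http://www.xyz.ddd.eee.ff/ 200 3662 124",
]

headers, sites = split_by_site_folder(sample)
for site, site_lines in sorted(sites.items()):
    print(site, len(site_lines))
```

Each per-site list, preceded by the collected header lines, could then be written to its own file and analyzed separately.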