| User | Post |
|
5:33 am February 2, 2008
| John Bilicki III
Guest
| | | |
|
| |
|
|
When people visit my site they click on links that set their
connection type (0=dialup and 1=broadband). In the example access log there are 233 queries, 213 of which are connection=1 (and therefore 20
instances of connection=0). However, AWStats only lists 14 hits with a value of '1' and never reports hits with a query value of '0'.
I have used the regex tester on this site and The Regulator (an application that also allows testing), and all of my examples matched correctly in both, so I'm at a loss at the moment! Here is
what I have…
ExtraSectionName1="HTTP Query Requests"
ExtraSectionCodeFilter1="200 304"
ExtraSectionCondition1="URL,\/?"
ExtraSectionFirstColumnTitle1="Audio Preferences"
ExtraSectionFirstColumnValues1="QUERY_STRING,audio=([0-2])"
ExtraSectionFirstColumnFormat1="%s"
ExtraSectionStatTypes1=HBL
ExtraSectionAddAverageRow1=0
ExtraSectionAddSumRow1=1
MaxNbOfExtra1=20
MinHitExtra1=2
|
|
|
5:43 am February 2, 2008
| Jean-Luc
Admin
| | | |
|
| posts 220 |
|
|
Hi,
This is a known problem with AWStats extra sections. When the value in the first column is zero, the line does not appear in the report.
It will work if you add other characters, like here :
ExtraSectionFirstColumnValues1="QUERY_STRING,(audio=[0-2])"
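A minimal Python sketch of the difference between the two patterns. Python's `re` module stands in for AWStats' Perl regex matching here, which is an assumption made for illustration and not AWStats code. With the original pattern the captured first-column value is the bare string `0`; with the wider group it is `audio=0`, which is presumably why the row is no longer dropped.

```python
import re

query = "audio=0&connection=1"

narrow = re.search(r"audio=([0-2])", query)  # original pattern
wide = re.search(r"(audio=[0-2])", query)    # pattern with the wider group

narrow_key = narrow.group(1)
wide_key = wide.group(1)

print(repr(narrow_key))  # '0'
print(repr(wide_key))    # 'audio=0'
```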
Jean-Luc
|
|
|
7:17 am February 2, 2008
| John Bilicki III
Guest
| | | |
|
| |
|
|
Thanks Jean-Luc,
A fix and another bug…
Commenting out this line in the main script allows the detection of zeros, and I haven't noticed any slowdown in the compilation of logs.
#delete $hashforselect->{0};
The next bug is that there are over 200 instances in the access log I'm using though the script only detects 21 instances of the connection query. Any idea why this might be happening?
|
|
|
7:38 am February 2, 2008
| Jean-Luc
Admin
| | | |
|
| posts 220 |
|
|
That's probably not another bug, but a configuration problem. Three things come to my mind:
1° If part of the log file was already processed, AWStats will not reprocess it when you run a new update. When you change the extra section, you need to erase the existing AWStats database files before running a new update if you want the change to take effect for the complete log file.
2° Double-check the ExtraSectionCodeFilter1, ExtraSectionCondition1 and ExtraSectionFirstColumnValues1 to be sure that 200 records match these.
3° I never tested the fix you suggested. Let's hope it is not the source of the "second bug".
Jean-Luc
|
|
|
8:47 am February 2, 2008
| John Bilicki III
Guest
| | | |
|
| |
|
|
I always delete the logs AWStats generates when I need to test changes and recompile the logs via AWStats. I think my recycling bin is getting a little full actually!
Yes, the extra is taking HTTP code 200s. I haven't modified that.
Will the extra HTTP query detection work if the query isn't the only one in the URL? For example, audio is another common query that connection will appear with, unless someone manually changes the connection specifically, which is rare.
Thanks for your help so far!
|
|
|
10:08 am February 2, 2008
| Jean-Luc
Admin
| | | |
|
| posts 220 |
|
|
Your extra section includes:
ExtraSectionCondition1="URL,\/?"
. . .
ExtraSectionFirstColumnValues1="QUERY_STRING,audio=([0-2])"
It will accept the following URLs:
/?audio=1
/?connection=1&audio=2
/?audio=1&connection=2
/?audio=1&anything=else
/?anything=here&audio=1
It will not accept the following URLs:
/?connection=1
/index.php?audio=1
/something.here?audio=1
Jean-Luc
EDIT: not correct (see comment below)
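On why the two lists above do not hold: in a regular expression, `\/?` means "an optional slash", not "a slash followed by a question mark", so it matches any string at all. A small Python check, assuming the condition is applied as an unanchored regex search (my understanding of how AWStats treats it, not something confirmed in this thread). The last test string is made up for illustration.

```python
import re

urls = [
    "/?audio=1",
    "/?connection=1",
    "/index.php?audio=1",
    "/something.here?audio=1",
    "no-slash-at-all",  # hypothetical string, only to show the empty match
]

optional_slash = re.compile(r"\/?")  # the condition as written
literal = re.compile(r"/\?")         # a slash followed by a literal '?'

matches_written = [u for u in urls if optional_slash.search(u)]
matches_literal = [u for u in urls if literal.search(u)]

print(matches_written)
print(matches_literal)
```

The first list contains every string, the second only the two that really contain `/?`.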
|
|
|
11:10 am February 2, 2008
| John Bilicki III
Guest
| | | |
|
| |
|
|
ExtraSectionName2="HTTP Query Connection Preferences"
ExtraSectionCodeFilter2="200 304"
ExtraSectionCondition2="REFERER,\/?||URL,\/?||URLWITHQUERY,\/?"
ExtraSectionFirstColumnTitle2="Connection Preferences"
ExtraSectionFirstColumnValues2="REFERER,connection=([0-2])||QUERY_STRING,connection=([0-2])"
ExtraSectionFirstColumnFormat2="%s"
ExtraSectionStatTypes2=HBL
ExtraSectionAddAverageRow2=0
ExtraSectionAddSumRow2=1
MaxNbOfExtra2=20
MinHitExtra2=2
This seems to get a more accurate count; however, I have to spend more time confirming. AWStats detected 12 instances and I counted 11 manually. What do you think?
|
|
|
11:25 am February 2, 2008
| Jean-Luc
Admin
| | | |
|
| posts 220 |
|
|
This is confusing. Why do you mix up REFERER and URL (or QUERY_STRING) within the same condition? If you want both, it would probably be clearer to use two different extra sections, one showing a report about REFERER and the other about QUERY_STRING.
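To see why mixing the two fields can shift the count, here is a rough Python sketch that tallies `connection=` matches in the request and in the referrer separately. The two log lines are made up for illustration (example.com is a placeholder), and the regex is a simplified combined-log parser, not what AWStats uses internally.

```python
import re

# Hypothetical sample lines, not taken from the real log.
lines = [
    '1.2.3.4 - - [02/Feb/2008:10:00:00 +0000] '
    '"GET /home/?audio=1&connection=1 HTTP/1.1" 200 591 "-" "UA"',
    '1.2.3.4 - - [02/Feb/2008:10:00:05 +0000] '
    '"GET /home/news.php HTTP/1.1" 200 900 '
    '"http://example.com/home/?audio=1&connection=1" "UA"',
]

# Captures the requested URL and the referrer field.
fields = re.compile(r'"\S+ (\S+) [^"]*" \d{3} \S+ "([^"]*)"')
value = re.compile(r"connection=([0-2])")

in_query = 0
in_referer = 0
for line in lines:
    m = fields.search(line)
    if not m:
        continue  # line does not look like a combined-format record
    url, referer = m.groups()
    if value.search(url):
        in_query += 1
    if value.search(referer):
        in_referer += 1

print(in_query, in_referer)
```

Each field matches once here, so a rule that accepts either field would presumably report two hits where a QUERY_STRING-only rule reports one.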
Jean-Luc
|
|
|
5:07 pm February 2, 2008
| John Bilicki III
Guest
| | | |
|
| |
|
|
Using only the following lines as an access log, there are 9 instances of 'connection=0'…
65.208.187.194 - - [31/Jan/2008:18:43:28 +0000] "GET /home/?audio=1&connection=0 HTTP/1.1" 200 591
74.6.22.167 - - [31/Jan/2008:23:43:07 +0000] "GET /home/?audio=1&connection=0 HTTP/1.0" 200 579 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
74.6.22.102 - - [01/Feb/2008:11:18:27 +0000] "GET /home/home-news.php?audio=0&connection=0 HTTP/1.0" 200 9114 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
65.55.208.191 - - [01/Feb/2008:22:53:33 +0000] "GET /home/?audio=1&connection=0 HTTP/1.0" 200 1106 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
65.55.165.16 - - [01/Feb/2008:22:57:26 +0000] "GET /home/?audio=1&connection=0 HTTP/1.0" 200 1106
61.135.190.16 - - [02/Feb/2008:03:35:48 +0000] "GET /home/?audio=1&connection=0 HTTP/1.1" 200 1118 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
61.135.190.16 - - [02/Feb/2008:04:11:52 +0000] "GET /home/home-news.php?audio=0&connection=0 HTTP/1.1" 200 40908 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
77.91.224.5 - - [02/Feb/2008:04:13:32 +0000] "GET /home/?audio=1&connection=0 HTTP/1.1" 200 591 "-" "WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)"
77.91.224.15 - - [02/Feb/2008:04:13:37 +0000] "GET /home/home-news.php?audio=0&connection=0 HTTP/1.1" 200 9114 "-" "WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)"
Here is the highest detecting extra code I have…
ExtraSectionName2="HTTP Query Requests"
ExtraSectionCodeFilter2="200 304"
ExtraSectionCondition2="URLWITHQUERY,\/?"
ExtraSectionFirstColumnTitle2="connection Preferences"
ExtraSectionFirstColumnValues2="QUERY_STRING,(connection=[0-2])"
ExtraSectionFirstColumnFormat2="%s"
ExtraSectionStatTypes2=HBL
ExtraSectionAddAverageRow2=0
ExtraSectionAddSumRow2=1
MaxNbOfExtra2=20
MinHitExtra2=2
AWStats detects only six (6) instances. The only relatively close statistical guess I have is seven (7), though I don't see any matching patterns there. I feel as though I'm staring directly at the issue, but I really don't get AWStats' algorithms for figuring stuff out.
|
|
|
6:08 pm February 2, 2008
| Jean-Luc
Admin
| | | |
|
| posts 220 |
|
|
In your 9 examples, there are only 6 instances of
ExtraSectionCondition2="URLWITHQUERY,\/?"
as the other three don't contain a slash immediately followed by a question mark.
I guess that, with
ExtraSectionCondition2="URLWITHQUERY,^.*$"
AWStats would detect 9 occurrences.
Jean-Luc
EDIT: not correct (see comment below)
|
|
|
6:16 pm February 2, 2008
| John Bilicki III
Guest
| | | |
|
| |
|
|
There are a total of 8 instances. By default AWStats counts 7. When I remove any line, the count drops to 6. After testing the log by deleting one line at a time, I found no single line whose removal keeps the count at 7.
65.208.187.194 - - [01/Feb/2008:11:18:25 +0000] "GET /home/?audio=1&connection=0 HTTP/1.1" 200 591 "-" "msnbot/1.0 (+http://search.msn.com/msnbot1.htm)"
74.6.22.167 - - [01/Feb/2008:11:18:26 +0000] "GET /home/?audio=1&connection=0 HTTP/1.0" 200 579 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
74.6.22.102 - - [01/Feb/2008:11:18:27 +0000] "GET /home/home-news.php?audio=0&connection=0 HTTP/1.0" 200 9114 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
65.55.208.191 - - [01/Feb/2008:22:53:33 +0000] "GET /home/?audio=1&connection=0 HTTP/1.0" 200 1106 "-" "msnbot/1.0 (+http://search.msn.com/msnbot2.htm)"
65.55.165.16 - - [01/Feb/2008:22:57:26 +0000] "GET /home/?audio=1&connection=0 HTTP/1.0" 200 1106 "-" "msnbot/1.0 (+http://search.msn.com/msnbot3.htm)"
61.135.190.16 - - [02/Feb/2008:03:35:48 +0000] "GET /home/?audio=1&connection=0 HTTP/1.1" 200 1118 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
77.91.224.5 - - [02/Feb/2008:04:13:32 +0000] "GET /home/?audio=1&connection=0 HTTP/1.1" 200 591 "-" "WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)"
77.91.224.15 - - [02/Feb/2008:04:13:37 +0000] "GET /home/home-news.php?audio=0&connection=0 HTTP/1.1" 200 9114 "-" "WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)"
This line doesn't seem to be detected…
65.55.165.16 - - [01/Feb/2008:22:57:26 +0000] "GET /home/?audio=1&connection=0 HTTP/1.0" 200 1106 "-" "msnbot/1.0 (+http://search.msn.com/msnbot3.htm)"
No other lines have the same IP, time, or user agent.
Other lines use the same HTTP method, HTTP version, HTTP code, bandwidth, and lack of referrer.
Can we maybe find something truly unique that we can duplicate?
Do you see anything truly unique that keeps this line from matching the rule?
|
|
|
6:42 pm February 2, 2008
| Jean-Luc
Admin
| | | |
|
| posts 220 |
|
|
Are you still talking about the same extra section? I get the feeling that we are chasing a moving target…
Note that in your previous example (with the 9 instances), two lines didn't have the referrer and user agent fields at the end. I am not sure that AWStats can accept these lines. They could be considered as corrupted records.
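One way to spot such records before feeding the log to AWStats is a rough Python check that each line ends with the referrer and user-agent fields of the combined log format. The regex below is a simplified assumption about that format, not AWStats' own parser. The two sample lines come from the earlier example.

```python
import re

# host ident user [time] "request" status bytes "referrer" "user agent"
combined = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "[^"]*"$'
)

lines = [
    '65.208.187.194 - - [31/Jan/2008:18:43:28 +0000] '
    '"GET /home/?audio=1&connection=0 HTTP/1.1" 200 591',
    '65.55.208.191 - - [01/Feb/2008:22:53:33 +0000] '
    '"GET /home/?audio=1&connection=0 HTTP/1.0" 200 1106 "-" '
    '"msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
]

# Lines that lack the trailing referrer and user-agent fields.
short = [line for line in lines if not combined.match(line)]

for line in short:
    print("missing trailing fields:", line)
```

Only the first line is flagged, since it stops after the byte count.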
Jean-Luc
|
|
|
7:45 pm February 2, 2008
| John Bilicki III
Guest
| | | |
|
| |
|
|
Adding '1.php' between the slash and the query question mark has no effect with your filter (it still detects 7 instances).
Rerunning through those same lines with the 1.php added with the original rule had no effect either.
Here is what the lines look like with the 1.php strings added. Which lines look corrupt? The "-" entries are the referrers, which are blank. What am I missing?
65.208.187.194 - - [01/Feb/2008:11:18:25 +0000] "GET /home/1.php?audio=1&connection=0 HTTP/1.1" 200 591 "-" "msnbot/1.0 (+http://search.msn.com/msnbot1.htm)"
74.6.22.167 - - [01/Feb/2008:11:18:26 +0000] "GET /home/1.php?audio=1&connection=0 HTTP/1.0" 200 579 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
74.6.22.102 - - [01/Feb/2008:11:18:27 +0000] "GET /home/home-news.php?audio=0&connection=0 HTTP/1.0" 200 9114 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
65.55.208.191 - - [01/Feb/2008:22:53:33 +0000] "GET /home/1.php?audio=1&connection=0 HTTP/1.0" 200 1106 "-" "msnbot/1.0 (+http://search.msn.com/msnbot2.htm)"
65.55.165.16 - - [01/Feb/2008:22:57:26 +0000] "GET /home/1.php?audio=1&connection=0 HTTP/1.0" 200 1106 "-" "msnbot/1.0 (+http://search.msn.com/msnbot3.htm)"
65.55.165.16 - - [01/Feb/2008:22:57:27 +0000] "GET /home/1.php?audio=1&connection=0 HTTP/1.0" 200 1106 "-" "msnbot/1.0 (+http://search.msn.com/msnbot3.htm)"
65.55.165.16 - - [01/Feb/2008:22:57:28 +0000] "GET /home/1.php?audio=1&connection=0 HTTP/1.0" 200 1106 "-" "msnbot/1.0 (+http://search.msn.com/msnbot3.htm)"
61.135.190.16 - - [02/Feb/2008:03:35:48 +0000] "GET /home/1.php?audio=1&connection=0 HTTP/1.1" 200 1118 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
77.91.224.5 - - [02/Feb/2008:04:13:32 +0000] "GET /home/1.php?audio=1&connection=0 HTTP/1.1" 200 591 "-" "WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)"
77.91.224.15 - - [02/Feb/2008:04:13:37 +0000] "GET /home/home-news.php?audio=0&connection=0 HTTP/1.1" 200 9114 "-" "WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)"
|
|
|
8:11 pm February 2, 2008
| John Bilicki III
Guest
| | | |
|
| |
|
|
slick-willy made a suggestion that worked, though when I went back to test against the original logs (using both his rule and yours) I got exactly the same results.
Here are the lines and the two rules for BOTH audio and connection…
65.208.187.194 - - [31/Jan/2008:18:43:28 +0000] "GET /home/?audio=1&connection=0 HTTP/1.1" 200 591 "http://www.jabcreations.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
74.6.22.167 - - [31/Jan/2008:23:43:07 +0000] "GET /home/?audio=1&connection=0 HTTP/1.0" 200 579 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
74.6.22.102 - - [01/Feb/2008:11:18:27 +0000] "GET /home/home-news.php?audio=0&connection=0 HTTP/1.0" 200 9114 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
65.55.208.191 - - [01/Feb/2008:22:53:33 +0000] "GET /home/?audio=1&connection=0 HTTP/1.0" 200 1106 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
65.55.165.16 - - [01/Feb/2008:22:57:26 +0000] "GET /home/?audio=1&connection=0 HTTP/1.0" 200 1106 "http://search.live.com/results.aspx?q=creations&mrt=en-us&FORM=LIVSOP" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"
61.135.190.16 - - [02/Feb/2008:03:35:48 +0000] "GET /home/?audio=1&connection=0 HTTP/1.1" 200 1118 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
61.135.190.16 - - [02/Feb/2008:04:11:52 +0000] "GET /home/home-news.php?audio=0&connection=0 HTTP/1.1" 200 40908 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
77.91.224.5 - - [02/Feb/2008:04:13:32 +0000] "GET /home/?audio=1&connection=0 HTTP/1.1" 200 591 "-" "WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)"
77.91.224.15 - - [02/Feb/2008:04:13:37 +0000] "GET /home/home-news.php?audio=0&connection=0 HTTP/1.1" 200 9114 "-" "WebAlta Crawler/2.0 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)"
The extra rules…
ExtraSectionName1="HTTP Query Requests"
ExtraSectionCodeFilter1="200 304"
ExtraSectionCondition1="URLWITHQUERY,\/?"
ExtraSectionFirstColumnTitle1="Audio Preferences"
ExtraSectionFirstColumnValues1="QUERY_STRING,(audio=[0-2])"
ExtraSectionFirstColumnFormat1="%s"
ExtraSectionStatTypes1=HBL
ExtraSectionAddAverageRow1=0
ExtraSectionAddSumRow1=1
MaxNbOfExtra1=20
MinHitExtra1=2
ExtraSectionName2="HTTP Query Requests"
ExtraSectionCodeFilter2="200 304"
ExtraSectionCondition2="URLWITHQUERY,\/?"
ExtraSectionFirstColumnTitle2="connection Preferences"
ExtraSectionFirstColumnValues2="QUERY_STRING,connection=([0-1])"
ExtraSectionFirstColumnFormat2="%s"
ExtraSectionStatTypes2=HBL
ExtraSectionAddAverageRow2=0
ExtraSectionAddSumRow2=1
MaxNbOfExtra2=20
MinHitExtra2=2
I get…
audio=1 = 4 hits detected of 6 (off by 2!)
audio=0 = 3 hits detected of 3 (GOOD!)
connection=0 = 7 hits detected of 9 (off by 2)
The results are the exact same when I use this…
="URLWITHQUERY,^.*$"
instead of this…
="URLWITHQUERY,\/?"
Thank goodness adding sums is simple; this issue is numbing my mind!
|
|
|
1:56 am February 3, 2008
| Jean-Luc
Admin
| | | |
|
| posts 220 |
|
|
My suggestion to replace ="URLWITHQUERY,\/?" by ="URLWITHQUERY,^.*$" was wrong. I should go back to school to learn regular expressions.
… but I know why you only see 7 hits: there are 7 hits in the February report and the 2 others are in the January report !
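A quick way to confirm this from the log itself is to group the hits by month before counting. The Python sketch below uses only the timestamp and request URL of the nine log lines posted above (the rest of each line is omitted), and the parsing is my own simplification rather than anything AWStats does.

```python
import re
from collections import Counter

# (timestamp, request URL) pairs from the nine log lines in the thread
hits = [
    ("31/Jan/2008:18:43:28", "/home/?audio=1&connection=0"),
    ("31/Jan/2008:23:43:07", "/home/?audio=1&connection=0"),
    ("01/Feb/2008:11:18:27", "/home/home-news.php?audio=0&connection=0"),
    ("01/Feb/2008:22:53:33", "/home/?audio=1&connection=0"),
    ("01/Feb/2008:22:57:26", "/home/?audio=1&connection=0"),
    ("02/Feb/2008:03:35:48", "/home/?audio=1&connection=0"),
    ("02/Feb/2008:04:11:52", "/home/home-news.php?audio=0&connection=0"),
    ("02/Feb/2008:04:13:32", "/home/?audio=1&connection=0"),
    ("02/Feb/2008:04:13:37", "/home/home-news.php?audio=0&connection=0"),
]

per_month = Counter()
for stamp, url in hits:
    day, mon, rest = stamp.split("/")
    month = mon + " " + rest[:4]  # e.g. "Feb 2008"
    for key in re.findall(r"(?:audio|connection)=[0-2]", url):
        per_month[(month, key)] += 1

for (month, key), count in sorted(per_month.items()):
    print(month, key, count)
```

For February this gives audio=1 four times, audio=0 three times and connection=0 seven times, which matches the numbers reported; the remaining two hits for audio=1 and connection=0 fall in January.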
Jean-Luc
|
|
|
2:45 am February 3, 2008
| John Bilicki III
Guest
| | | |
|
| |
|
|
You've got a better comprehension of regular expressions than I do, so don't sweat it. I was so exhausted towards the end of my night last night that I failed to realize that yes, the "missing" hits were in January.
It looks fine now, I think. I'll put up a private working example for January after I set up the rest of the extra filters, if you want. There are all sorts of interesting things that I was unaware I could so easily track with AWStats.
"Trying is the first step towards failure." - Homer Simpson 
|
|