TV-Browser (TVB)


8:39 pm
August 8, 2008


deerwood

New Member

posts 2

Hi all,

I'm serving files for http://tvbrowser.org and want to analyze traffic to the dedicated TVB download site with AWStats (6.7) as well as possible. The downloadable files (many, about 15,000) are mostly *.gz, updated once a day via 'rsync --delete ….' (they change every day: old ones get deleted, new ones come in, and existing ones for the next 4 weeks of TV channels are updated often).

Access to them is via HTTP GET requests from the TVB Java application, which identifies itself with a user agent such as

  • TV-Browser 2.2.5 Java/1.4.2_12
  • TV-Browser 2.6 Java/1.6.0
  • TV-Browser 2.6.3 Java/1.6.0_07
  • TV-Browser 2.7 Java/1.6.0_03

depending on the TVB and Java version installed.

The main problem is: I had to remove the 'java' catchall in "lib/robots.pm", otherwise the statistics were totally wrong: TV-Browser was not treated as a normal browser but was instead summed up under 'robots'. But for my site (and a few others), TV-Browser is the main, if not the only, important browser.

Simply adding 'tv\\-browser' in "lib/browsers.pm" (in all necessary places) did not do the job on its own; the 'java' catchall seems to take precedence. I also took a look into "awstats.pl". I confess that I didn't grasp most of it, but it seems to me that the user agent string is checked against robots.pm before it is checked against browsers.pm?

So, am I left alone with this workaround? This would be a bad compromise, because AWStats runs on a virtual host serving several other domains … at least for the other domains, the 'java' catchall should be in place!

Hints, anybody?

Second problem: I am interested in getting statistics for the TVB version used (as is done e.g. for MSIE, Firefox or SVN). But the version handling seems to be buried deep inside "awstats.pl" and not configurable at all?

Thanks for your time, the great AWStats software and all helpers here on this fascinating forum!

Best regards,

Georg

5:01 am
August 9, 2008


Jean-Luc

Admin

posts 1042

Hi Georg,

Did you try to replace the 'java' entry by '^java' in robots.pm? This should solve your first problem.
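To illustrate what the anchor changes, here is a minimal sketch using Python's re module (not AWStats' own Perl code). It compares the unanchored 'java' entry with '^java' against user agent strings quoted in the first post plus a bare Java client. The helper name is_robot and the case-insensitive matching are assumptions made for this sketch only.

```python
import re

# Two user agents from the first post, plus a bare Java client for contrast.
USER_AGENTS = [
    "TV-Browser 2.2.5 Java/1.4.2_12",
    "TV-Browser 2.6.3 Java/1.6.0_07",
    "Java/1.6.0_07",
]


def is_robot(pattern, user_agent):
    """Return True if the robot pattern matches somewhere in the user agent.

    Assumption: case-insensitive matching; how AWStats applies robots.pm
    entries internally is not verified here.
    """
    return re.search(pattern, user_agent, re.IGNORECASE) is not None


# Unanchored entry: matches 'java' at any position in the string.
unanchored = [is_robot(r"java", ua) for ua in USER_AGENTS]
# Anchored entry: matches only when the string starts with 'java'.
anchored = [is_robot(r"^java", ua) for ua in USER_AGENTS]

for ua, plain, caret in zip(USER_AGENTS, unanchored, anchored):
    print(f"{ua!r}: 'java' -> {plain}, '^java' -> {caret}")
```

With the anchor, only the user agent that begins with "Java" is still flagged; the TV-Browser strings are not.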

Regarding the versions of the TV-Browser, you can get a detailed report by using an extra section. This is an untested example:

ExtraSectionName1="TV Browsers"
ExtraSectionCodeFilter1="200 304"
ExtraSectionCondition1="URL,.*"
ExtraSectionFirstColumnTitle1="TVB"
ExtraSectionFirstColumnValues1="UA,^(TV-Browser .*)$"
ExtraSectionFirstColumnFormat1="%s"
ExtraSectionStatTypes1=PHBL
ExtraSectionAddSumRow1=1
MaxNbOfExtra1=20
MinHitExtra1=1

6:14 pm
August 9, 2008


deerwood

New Member

posts 2

Hi Jean-Luc,

Many thanks for your immediate and helpful answer!

"Did you try to replace the 'java' entry by '^java' in robots.pm? This should solve your first problem."

Yes, that would be better. But I really need a regular expression that matches java in any position unless it is preceded by tv-browser. I think I can eventually come up with a negative lookbehind assertion, e.g. something like '(?<!tv\\-browser).*java'. See perldoc perlretut. I'll test that and report back here.
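One caveat on a pattern of that shape, sketched here with Python's re module rather than Perl: a lookbehind only tests the text immediately before the position where it sits. With a following '.*', the match can start at the very beginning of the string, where nothing precedes it, and still reach 'java'. A negative lookahead anchored at the start may be closer to the intent. The alternative pattern, the lower-cased sample strings and the "somebot" user agent are hypothetical, and whether robots.pm entries accept such constructs would need testing.

```python
import re

# Pattern as proposed in the post (written with a single backslash here).
LOOKBEHIND = r"(?<!tv\-browser).*java"
# Hypothetical alternative: reject any string containing 'tv-browser',
# otherwise match 'java' at any position.
LOOKAHEAD = r"^(?!.*tv\-browser).*java"

# Lower-cased samples; the last one is a made-up robot user agent.
samples = [
    "tv-browser 2.6 java/1.6.0",
    "java/1.6.0_07",
    "somebot (java/1.5.0)",
]


def matches(pattern, user_agent):
    """Return True if the pattern matches anywhere in the user agent."""
    return re.search(pattern, user_agent) is not None


lookbehind_hits = [matches(LOOKBEHIND, ua) for ua in samples]
lookahead_hits = [matches(LOOKAHEAD, ua) for ua in samples]

for ua, behind, ahead in zip(samples, lookbehind_hits, lookahead_hits):
    print(f"{ua!r}: lookbehind -> {behind}, lookahead -> {ahead}")
```

In this sketch the lookbehind version still matches the TV-Browser string, while the anchored lookahead excludes it and keeps matching the other two.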

Your extra section worked essentially like a charm, thanks for that. The only thing I changed is

ExtraSectionFirstColumnValues1="UA,^(TV-Browser [^J]*)J"

to get rid of the Java part of the UA string, as I'm only interested in the TVB version, not in the combination of TVB and Java version (TVB itself is around in about 50 different versions). The regex looks for anything that is not an upper-case J after 'TV-Browser ', followed by a J. That final J might not even be necessary, because * matches greedily, but I have to test that again.
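A small sketch of what the capture group picks up, again using Python's re module as a stand-in for the Perl regex engine. It compares the original pattern, the narrowed one, and the narrowed one without the final J. The helper name capture and the Java-less user agent "TV-Browser 2.7" are hypothetical.

```python
import re

FULL = r"^(TV-Browser .*)$"          # pattern from the suggested extra section
WITH_J = r"^(TV-Browser [^J]*)J"     # narrowed pattern with the final J
WITHOUT_J = r"^(TV-Browser [^J]*)"   # same, with the final J dropped


def capture(pattern, user_agent):
    """Return the first capture group, or None if the pattern does not match."""
    found = re.search(pattern, user_agent)
    return found.group(1) if found else None


ua = "TV-Browser 2.6.3 Java/1.6.0_07"
no_java = "TV-Browser 2.7"  # hypothetical user agent without a Java part

print(repr(capture(FULL, ua)))
print(repr(capture(WITH_J, ua)))
print(repr(capture(WITHOUT_J, ua)))
print(repr(capture(WITH_J, no_java)))
print(repr(capture(WITHOUT_J, no_java)))
```

Two things stand out in this sketch: the captured value keeps the space before "Java", and the final J only matters for user agents that lack a Java part, which the pattern with the J skips and the pattern without it still captures.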

By the way, did you know that writing/editing in this forum is impossible with Opera? The links to start a post/edit don't work at all. No real problem, I used Firefox. But just to let you know.

Thanks and best regards,

Georg

4:32 am
August 10, 2008


Jean-Luc

Admin

posts 1042

Not sure why you need to pick up “Java” anywhere in the user agent as most Java robots have “Java” at the beginning of their user agent. Anyway, the negative lookbehind assertion is probably the way to go. Please let me know if it works as expected.

[off topic]

Thanks a lot for reporting the problem you had with Opera. I am posting this answer from my Opera 9.50 browser.

Posting from Opera – my preferred browser – seems to work fine here.

 [/off topic]
