SamC
White Water Yakist
3467 Posts |
Posted - 2002-06-14 : 10:03:12
|
| I'm faced with a requirement to track "files hit" on a website. The needed fields are: UserID, Time, URL. The goal would be post-mortem management reports on pages visited by users and time spent per page.

This seems like a straightforward table to me - 3 fields - but there may be scalability and performance considerations. I'm sure others have thought about this problem before. I'd appreciate any feedback on issues I haven't considered, and design solutions that might address these issues.

Thanks,
Sam |
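A minimal sketch of the table Sam describes, using Python's sqlite3 as a stand-in for SQL Server (the table, column, and index names here are assumptions, not anything Sam specified):

```python
import sqlite3
from datetime import datetime

# In-memory SQLite stands in for SQL Server; names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PageHits (
        UserID  INTEGER NOT NULL,
        HitTime TEXT    NOT NULL,   -- ISO timestamp; SQL Server would use datetime
        URL     TEXT    NOT NULL
    )
""")
# An index on (UserID, HitTime) supports the per-user, time-ordered reports.
conn.execute("CREATE INDEX IX_PageHits_User_Time ON PageHits (UserID, HitTime)")

conn.execute(
    "INSERT INTO PageHits (UserID, HitTime, URL) VALUES (?, ?, ?)",
    (42, datetime(2002, 6, 14, 10, 3, 12).isoformat(), "/lesson1.asp"),
)
print(conn.execute("SELECT COUNT(*) FROM PageHits").fetchone()[0])  # 1
```

The narrow row keeps inserts cheap; the scalability questions raised below are about report queries over this table, not the writes.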
|
|
Merkin
Funky Drop Bear Fearing SQL Dude!
4970 Posts |
Posted - 2002-06-14 : 10:10:09
|
| The other thing you may want to track is the referrer. I do something like this on one of my sites: I have a stored proc that fires on every page.

It can bog down when you try to report on 3 months' worth of hits, so every month I send old data to an archive table. This keeps my log table small but lets me do a big report when needed.

What sort of traffic are you expecting?

Damian |
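Damian's monthly archive step can be sketched like this (again with sqlite3 standing in for SQL Server, and table names that are assumptions): copy rows older than a cutoff into the archive, then delete them from the live log, inside one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HitLog     (UserID INTEGER, HitTime TEXT, URL TEXT);
    CREATE TABLE HitArchive (UserID INTEGER, HitTime TEXT, URL TEXT);
    INSERT INTO HitLog VALUES (1, '2002-03-01T09:00:00', '/old.asp');
    INSERT INTO HitLog VALUES (1, '2002-06-10T09:00:00', '/new.asp');
""")

cutoff = "2002-06-01T00:00:00"
with conn:  # single transaction, so a failure can't lose rows mid-move
    conn.execute(
        "INSERT INTO HitArchive SELECT * FROM HitLog WHERE HitTime < ?", (cutoff,))
    conn.execute("DELETE FROM HitLog WHERE HitTime < ?", (cutoff,))

print(conn.execute("SELECT COUNT(*) FROM HitLog").fetchone()[0])      # 1
print(conn.execute("SELECT COUNT(*) FROM HitArchive").fetchone()[0])  # 1
```

A big report then unions the log and archive tables; the day-to-day pages only ever touch the small live table.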
 |
|
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2002-06-14 : 10:10:17
|
| Wow, what a coincidence - I am currently daydreaming about Indexed Views. You could create a view to directly support each of your reports, comprised of the aggregations from your base tables. Then, with the proper index on the view, you could vastly improve the performance of your reporting.....uh.....at the expense of your data gathering.

Maybe you could pump your data in real time into a staging table and then, on a daily basis, move it to your master table, which would be the base table for your indexed view.

I like it (maybe).

<O> |
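The staging idea can be sketched as a daily roll-up job (all names here are assumptions, and a plain summary table stands in for the indexed view): raw hits land in a cheap-to-write staging table, and once a day they are aggregated into the table the reports query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HitStaging (UserID INTEGER, HitDate TEXT, URL TEXT);
    CREATE TABLE DailyHits  (HitDate TEXT, URL TEXT, Hits INTEGER,
                             PRIMARY KEY (HitDate, URL));
    INSERT INTO HitStaging VALUES (1, '2002-06-14', '/index.asp');
    INSERT INTO HitStaging VALUES (2, '2002-06-14', '/index.asp');
    INSERT INTO HitStaging VALUES (1, '2002-06-14', '/faq.asp');
""")

with conn:  # the nightly job: aggregate, then empty the staging table
    conn.execute("""
        INSERT INTO DailyHits (HitDate, URL, Hits)
        SELECT HitDate, URL, COUNT(*) FROM HitStaging GROUP BY HitDate, URL
    """)
    conn.execute("DELETE FROM HitStaging")

print(conn.execute(
    "SELECT Hits FROM DailyHits WHERE URL = '/index.asp'").fetchone()[0])  # 2
```

This is the same trade Page47 names: page loads pay only for a single narrow insert, and the aggregation cost is deferred to the nightly batch.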
 |
|
|
SamC
White Water Yakist
3467 Posts |
Posted - 2002-06-14 : 13:14:26
|
| My problem is not as large as you may think. My webs are training classes, and the total users are usually around 10K to 40K. Worst case, 50 to 100 URLs per user.

This is such a small amount of data that it probably doesn't have much of a performance hit. Seems to me that this problem should've been beaten to death by folks designing web hit analysis, but maybe they do that with flat files - not SQL?

Sam |
 |
|
|
M.E.
Aged Yak Warrior
539 Posts |
Posted - 2002-06-14 : 13:33:55
|
quote: pages visited by users and time spent per page.
Major issue for M.E.... Time spent per page is actually the hardest thing to gather. How do you know when they go to a new page, and how do you know when they've closed the window and stopped viewing?

-----------------------
Take my advice, I dare ya |
 |
|
|
AjarnMark
SQL Slashing Gunting Master
3246 Posts |
Posted - 2002-06-14 : 14:07:29
|
quote: Seems to me that this problem should've been beaten to death by folks designing web hit analysis, but maybe they do that with flat files - not SQL??
BINGO! I've looked at a couple of different web site statistics packages, and they all do summary analysis on the IIS log file (a flat text file) to build their info.

Due to the web's disconnected nature, the only way to determine "time spent per page" (as M.E. points out, there are several issues) is to track each user from page to page and calculate the difference between when they hit page 1 and when they hit page 2. Unfortunately, you don't know when they close their browser, so getting a time for the last page is really impossible. (However, if you have a login and logout, and can get users to consistently use the logout, then you at least have a "last page" for which the timing is not needed.) |
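The page-to-page differencing AjarnMark describes can be sketched in a few lines: sort one user's hits by time, and the duration for each page is the gap to the next hit. The last page, as he notes, simply gets no duration.

```python
from datetime import datetime

def time_per_page(hits):
    """hits: list of (timestamp, url) tuples for ONE user, in any order."""
    hits = sorted(hits)  # order by timestamp
    durations = []
    for (t1, url), (t2, _next_url) in zip(hits, hits[1:]):
        durations.append((url, (t2 - t1).total_seconds()))
    return durations  # the final page is omitted: no "next hit" to diff against

# Hypothetical session for one user:
hits = [
    (datetime(2002, 6, 14, 10, 0, 0), "/index.asp"),
    (datetime(2002, 6, 14, 10, 1, 30), "/lesson1.asp"),
    (datetime(2002, 6, 14, 10, 4, 0), "/quiz.asp"),
]
print(time_per_page(hits))
# [('/index.asp', 90.0), ('/lesson1.asp', 150.0)]
```

A real report would also cap implausibly long gaps (the user walked away or closed the browser), which is one of the "several issues" M.E. alludes to.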
 |
|
|
jasper_smith
SQL Server MVP & SQLTeam MVY
846 Posts |
Posted - 2002-06-14 : 14:32:10
|
| We use Analysis Services to report on our website activity and it works pretty well. We track pretty much everything we can, particularly where people have come from (for our affiliate programs), where they go on the sites, where they leave (pretty important too), and our own querystring codes so we can track marketing channel and effectiveness of various campaigns (mail/email/popups/banner ads etc.). Every night we archive the day's session and transactional data to our data warehouse, cleanse and transform it, and rebuild our cubes. We use ProClarity as the client tool, although we have been playing with M$ Data Analyzer.

We only went live with it about 6 months ago, but performance compared to SQL is very good on our datamarts (they're still pretty small though - only about 10 million rows in the largest one). |
 |
|
|
nlocklin
Yak Posting Veteran
69 Posts |
Posted - 2002-06-16 : 19:43:24
|
| If all you are doing is tracking hits (the fact that a user visited a page on your website), then an even better solution might be to import the IIS log files (or the logs from whatever web server you are using). IIS log files are just space-delimited, which works pretty well for importing in a DTS package. You can build a list of hits to skip over as well (images, for example) to get just your individual pages.

The only downside to this is that you have to wait a day for a report (like WebTrends and other software). The upside is that the user isn't making a connection to your database while they're actually visiting the site, and you get all of the information that IIS tracks (like referrer, time, IP address, username if they log in through IIS, etc.).

When I first started tracking visitors, I tried a stored procedure on each page. It got kind of annoying to manage, and now I just read in the IIS log files.

--
"I'm always doing that. I'm always messing up some mundane detail." |
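A sketch of that import step, assuming the W3C extended format IIS writes (the sample lines and the field list below are illustrative, not from a real log): the `#Fields:` header names the space-delimited columns, and a skip-list drops image hits, much as nlocklin describes doing in DTS.

```python
# Hits whose URL ends in one of these extensions are skipped (assumed list).
SKIP_EXTENSIONS = (".gif", ".jpg", ".css", ".js")

def parse_iis_log(lines):
    """Parse W3C-extended log lines into dicts, skipping comments and images."""
    fields, hits = [], []
    for line in lines:
        if line.startswith("#Fields:"):
            fields = line.split()[1:]          # e.g. date time cs-uri-stem ...
        elif line.startswith("#") or not line.strip():
            continue                           # other directives / blank lines
        else:
            row = dict(zip(fields, line.split()))
            if not row["cs-uri-stem"].lower().endswith(SKIP_EXTENSIONS):
                hits.append(row)
    return hits

sample = [
    "#Fields: date time c-ip cs-uri-stem sc-status",
    "2002-06-16 19:43:24 10.0.0.1 /index.asp 200",
    "2002-06-16 19:43:25 10.0.0.1 /logo.gif 200",
]
print([h["cs-uri-stem"] for h in parse_iis_log(sample)])  # ['/index.asp']
```

The resulting rows could then be bulk-loaded into the hits table on the nightly schedule nlocklin mentions.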
 |
|
|
Merkin
Funky Drop Bear Fearing SQL Dude!
4970 Posts |
Posted - 2002-06-16 : 19:48:25
|
| Except that one of the few requirements is to track a database UserID. You are not going to find that in your IIS logs.

Damian |
 |
|
|
uberbloke
Yak Posting Veteran
67 Posts |
Posted - 2002-06-17 : 04:02:23
|
| Don't forget that IIS logging (like Session) adds IIS overhead. I need to validate users and track 'em anyway, so I do it all in the same SP with one DB hit per page, and I use my own cookie rather than the Session object (so one day my app might even scale to more than one server). That way I only rely on IIS for web and ASP serving, and collect everything else I need explicitly. |
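The own-cookie idea can be sketched framework-neutrally (the cookie name `trk` and the dicts standing in for real request/response cookie collections are assumptions): issue an opaque ID on first visit and reuse it thereafter, so tracking survives across servers without server-side session state.

```python
import uuid

def get_or_set_tracking_id(request_cookies, response_cookies):
    """Return the visitor's tracking ID, minting one if this is a first visit."""
    tracking_id = request_cookies.get("trk")
    if tracking_id is None:
        tracking_id = uuid.uuid4().hex          # opaque, not derived from UserID
        response_cookies["trk"] = tracking_id   # a real app would also set
                                                # expiry, path, HttpOnly, etc.
    return tracking_id

resp = {}
first = get_or_set_tracking_id({}, resp)             # new visitor: cookie issued
second = get_or_set_tracking_id({"trk": first}, {})  # returning: same ID reused
print(first == second)  # True
```

The page's logging SP can then record this ID (joined to the database UserID after login) with the single per-page DB hit uberbloke describes.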
 |
|
|