
 Tracking URLs


SamC
White Water Yakist

3467 Posts

Posted - 2002-06-14 : 10:03:12
I'm faced with a requirement to track "files hit" on a website. The needed fields are:
UserID, Time, URL

The goal would be post-mortem management reports on pages visited by users and time spent per page.

This seems like a straightforward table to me - 3 fields, but there may be scalability and performance considerations. I'm sure others have thought about this problem before. I'd appreciate any feedback on issues I haven't considered, and design solutions that might address these issues.
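
For reference, the three-field table could be as simple as the sketch below. This is my own illustration, not from the thread; the table name, column names, and types are assumptions.

```sql
-- A minimal sketch of the hit-tracking table (names and types assumed).
CREATE TABLE PageHit (
    UserID   int          NOT NULL,                  -- FK to your users table
    HitTime  datetime     NOT NULL DEFAULT GETDATE(),
    URL      varchar(400) NOT NULL
)

-- A clustered index on HitTime keeps inserts appending to the end of the
-- table, which helps write performance on a busy logging table.
CREATE CLUSTERED INDEX IX_PageHit_HitTime ON PageHit (HitTime)
```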

Thanks,

Sam

Merkin
Funky Drop Bear Fearing SQL Dude!

4970 Posts

Posted - 2002-06-14 : 10:10:09
The other thing you may want to track is the referrer (the HTTP Referer header).

I do something like this on one of my sites. I have a stored proc that fires on every page.

It can bog down when you try to report on 3 months worth of hits, so every month I send old data to an archive table. This keeps my log table small but lets me do a big report when needed.
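
The monthly rotation Damian describes could look something like the sketch below (table names are assumed, not from the post). Moving the rows and deleting them inside one transaction keeps the log and the archive consistent.

```sql
-- Monthly archive job (sketch; table names assumed).  Move rows older
-- than one month into the archive table, then remove them from the live
-- log, all in one transaction so nothing is lost or double-counted.
BEGIN TRANSACTION

INSERT INTO PageHitArchive (UserID, HitTime, URL)
SELECT UserID, HitTime, URL
FROM PageHit
WHERE HitTime < DATEADD(month, -1, GETDATE())

DELETE FROM PageHit
WHERE HitTime < DATEADD(month, -1, GETDATE())

COMMIT TRANSACTION
```

For the occasional big report, a view that does UNION ALL across the live and archive tables lets you query both as one.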

What sort of traffic are you expecting?

Damian

Page47
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2002-06-14 : 10:10:17
Wow, what a coincidence that I am currently daydreaming about Indexed Views. You could create a view to directly support each of your reports, comprised of the aggregations from your base tables. Then, with the proper index on the view, you could vastly improve the performance of your reporting.....uh.....at the expense of your data gathering.

Maybe you could pump your data in real time into a staging table and then, on a daily basis, move it to your master table, which would be the base table for your indexed view.
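
A sketch of what that indexed view might look like (my illustration; object names assumed). SQL Server 2000 indexed views require SCHEMABINDING, COUNT_BIG(*), two-part names, and deterministic expressions, which is why the date is stripped with DATEADD/DATEDIFF rather than CONVERT to a string.

```sql
-- Sketch: an indexed view aggregating hits per URL per day.
CREATE VIEW dbo.HitsPerUrlPerDay
WITH SCHEMABINDING
AS
SELECT URL,
       -- deterministic "strip the time portion" expression
       DATEADD(day, DATEDIFF(day, 0, HitTime), 0) AS HitDate,
       COUNT_BIG(*) AS HitCount
FROM dbo.PageHit
GROUP BY URL, DATEADD(day, DATEDIFF(day, 0, HitTime), 0)
GO

-- The unique clustered index is what actually materializes the view.
CREATE UNIQUE CLUSTERED INDEX IX_HitsPerUrlPerDay
ON dbo.HitsPerUrlPerDay (URL, HitDate)
```

The trade-off Page47 mentions is real: every insert into dbo.PageHit now also maintains the view's index, which is why staging the raw hits and moving them across daily makes sense.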

I like it (maybe).

<O>

SamC
White Water Yakist

3467 Posts

Posted - 2002-06-14 : 13:14:26
My problem is not as large as you may think.

My sites are training classes, and the total user count is usually around 10K to 40K. Worst case, 50 to 100 URLs per user.

This is such a small amount of data that it probably doesn't cause much of a performance hit. Seems to me that this problem should've been beaten to death by folks designing web hit analysis, but maybe they do that with flat files - not SQL??

Sam


M.E.
Aged Yak Warrior

539 Posts

Posted - 2002-06-14 : 13:33:55
quote:
pages visited by users and time spent per page.


Major issue for M.E.... Time spent per page is actually the hardest thing to gather. How do you know when they go to a new page? How do you know when they've closed the window and stopped viewing?



-----------------------
Take my advice, I dare ya

AjarnMark
SQL Slashing Gunting Master

3246 Posts

Posted - 2002-06-14 : 14:07:29
quote:
Seems to me that this problem should've been beaten to death by folks designing web hit analysis, but maybe they do that with flat files - not SQL??


BINGO! I've looked at a couple different web site statistics packages and they all do summary analysis on the IIS Log file (flat text file) to build their info.

Due to the web's disconnected nature, the only way to determine "time spent per page" (as M.E. points out, there are several issues) is to track that user from page to page and calculate the difference between when they hit page 1 and when they hit page 2. Unfortunately, you don't know when they close their browser, so getting a time for the last page is really impossible. (However, if you have a login and logout and can get them to consistently use the logout, then you at least have a "last page" for which the timing is not needed.)
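
That page-to-page difference can be computed in SQL. A sketch, assuming the PageHit table described earlier in the thread; since SQL Server 2000 has no window functions, a correlated subquery finds each hit's successor, and the last page per user comes back NULL, exactly the gap AjarnMark describes.

```sql
-- Sketch: seconds spent on each page, as the gap between consecutive
-- hits by the same user.  The user's final page has no successor, so
-- SecondsOnPage is NULL for it.
SELECT h.UserID,
       h.URL,
       h.HitTime,
       DATEDIFF(second, h.HitTime,
                (SELECT MIN(n.HitTime)
                 FROM PageHit n
                 WHERE n.UserID = h.UserID
                   AND n.HitTime > h.HitTime)) AS SecondsOnPage
FROM PageHit h
ORDER BY h.UserID, h.HitTime
```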


jasper_smith
SQL Server MVP & SQLTeam MVY

846 Posts

Posted - 2002-06-14 : 14:32:10
We use Analysis Services to report on our website activity and it works pretty well. We track pretty much everything you can, particularly where people have come from (for our affiliate programs), where they go on the sites, where they leave (pretty important too), and our own querystring codes so we can track marketing channel and the effectiveness of various campaigns (mail/email/popups/banner ads etc). Every night we archive our session and transactional data for the day to our data warehouse, cleanse and transform it, and rebuild our cubes. We use ProClarity as the client tool, although we have been playing with M$ Data Analyser.
We only went live with it about 6 months ago, but performance compared to plain SQL is very good on our datamarts (they're still pretty small though, only about 10 million rows in the largest one).


nlocklin
Yak Posting Veteran

69 Posts

Posted - 2002-06-16 : 19:43:24
If all you are doing is tracking hits (the fact that a user visited a page on your website), then an even better solution might be to import the IIS logfiles (or log from whatever web server you are using).

IIS logfiles are just space-delimited, which works pretty well for importing in a DTS package. You can build a list of hits to skip over as well (images, for example) to just get your individual pages.
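
Once the raw log lines have landed in a staging table via DTS, the skip-over step could be a simple filtered copy. A sketch only; the staging table and its column names (loosely following the W3C extended log format fields IIS writes) are assumptions.

```sql
-- Sketch: after a DTS bulk load of the IIS log into a staging table,
-- copy across only page requests, skipping images and other static
-- files.  Column names follow the W3C extended log fields (assumed).
INSERT INTO PageHit (UserID, HitTime, URL)
SELECT s.cs_username,
       s.LogTime,
       s.cs_uri_stem
FROM IisLogStaging s
WHERE s.cs_uri_stem NOT LIKE '%.gif'
  AND s.cs_uri_stem NOT LIKE '%.jpg'
  AND s.cs_uri_stem NOT LIKE '%.css'
  AND s.cs_uri_stem NOT LIKE '%.js'
```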

The only downside to this is that you have to wait a day for a report (like WebTrends and other software). The upside is that the user isn't making a connection to your database while they're actually visiting the site, and you get all of the information that IIS tracks (like referrer, time, IP address, username if they log in through IIS, etc.).

When I first started tracking visitors, I tried a stored procedure on each page. It got kind of annoying to manage, and now I just read in the IIS logfiles.


--
"I'm always doing that. I'm always messing up some mundane detail."

Merkin
Funky Drop Bear Fearing SQL Dude!

4970 Posts

Posted - 2002-06-16 : 19:48:25
Except that one of the few requirements is to track a database UserID. You are not going to find that in your IIS logs.

Damian

uberbloke
Yak Posting Veteran

67 Posts

Posted - 2002-06-17 : 04:02:23
Don't forget that IIS logging (like Session state) adds its own overhead in IIS.

I need to validate users and track them, so I do it all in the same stored procedure with one database hit per page. I use my own cookie rather than the Session object (so one day my app might even scale to more than one server). That way I rely on IIS only for serving web pages and ASP, and I collect everything else I need explicitly.
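
That combined validate-and-log procedure might look something like this. A sketch only: the procedure name, the session table, and the cookie-token scheme are my assumptions, not details from the post.

```sql
-- Sketch: one stored procedure per page view that validates the user's
-- cookie token and logs the hit in a single database round trip.
CREATE PROCEDURE dbo.ValidateAndLogHit
    @CookieToken varchar(50),
    @URL         varchar(400)
AS
DECLARE @UserID int

-- Look up the user behind this cookie token (session table assumed).
SELECT @UserID = UserID
FROM UserSession
WHERE CookieToken = @CookieToken

IF @UserID IS NULL
BEGIN
    RETURN 1   -- no valid session; the page redirects to login
END

INSERT INTO PageHit (UserID, HitTime, URL)
VALUES (@UserID, GETDATE(), @URL)

RETURN 0       -- validated and logged
```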



   
