A little historyAlmost a year ago now I retro-fitted an app for a website with basic capabilities of tracking impressions and clicks of particular dynamic pages. I built it using 2 tables, impressions and clicks. Back then it was only regarded to be very basic so I wasn't too concerned about performance, just that it recorded every impression & click. With impressions I did a row per day per tracked page but I opted for a row per click for all clicks. The 2 tables now look like:CREATE TABLE [Place_Imp] ( [site] [int] NOT NULL , [list_id] [int] NOT NULL , [imp_date] [datetime] NOT NULL , [imp] [int] NOT NULL , CONSTRAINT [PK_Place_Imp] PRIMARY KEY CLUSTERED ( [site], [list_id], [imp_date] ) WITH FILLFACTOR = 90 ON [PRIMARY] ) ON [PRIMARY]GOCREATE TABLE [Place_Clicks] ( [site] [int] NOT NULL , [list_id] [int] NOT NULL , [type_id] [int] NOT NULL , [click_date] [datetime] NOT NULL , [user_ip] [nvarchar] (20) COLLATE Latin1_General_CI_AS NOT NULL ,) ON [PRIMARY]GO
I used to have a clustered primary key on Place_Click based on site, list_id, type_id and click_date but this had to be removed due performance issues related to what I was storing in the click_date. As I was storing every click I was storing them with their full datetime but when performance became an issue I had to drop the time portion of the clicks to get the preformance I needed for the application that uses these table to generate emails.The tables were about half the size they are now. Currently Place_Clicks has 422800 rows and climbing and Place_Imp has 616466. The problem back then was some how related to the CONVERT function. Basically I was trying to group the clicks by date so was using CONVERT to drop the time part to get the groupings to work. Something like this:select convert(nvarchar(10), click_date, 102) as date_click, list_id, count(user_ip) as clicksfrom place_clickswhere convert(nvarchar(10), click_date, 102) between convert(nvarchar(10), dateadd(d, -1, getdate()), 102) and convert(nvarchar(10), dateadd(d, -8, getdate()), 102)group by convert(nvarchar(10), click_date, 102), list_idorder by convert(nvarchar(10), click_date, 102), list_id
I have also tried this but this just gives me errors:select convert(nvarchar(10), click_date, 102) as clickdate, list_id, user_ip, count(user_ip) as clicksfrom clickswhere clickdate between convert(nvarchar(10), dateadd(d, -1, getdate()), 102) and convert(nvarchar(10), dateadd(d, -8, getdate()), 102)group by clickdate, list_id, user_iporder by clickdate, list_id
I wish I could find the orginal stored procedure that I used where this query was use in but to cut and all ready long story a little bit short, I ended up removing the time so I could easily group by date without having to do any conversion. I did try numerous things to save the time for just a problem I am now facing but couldn't justify anymore time to be spent on it. And this brings us upto date.The Current ProblemCurrently the sites that I monitor using this system seem to be getting hit very hard and we are getting what would be extremely odd results. I know it sounds were that I want to fix "stats" which technical is what I have been asked to do but showing customers the results I am currently getting pre-day would be suicide too and it does get queried a lot. So the plan is, if I can sort the problem with the data grouping would be to limit the number of clicks a user could do to something like 1 click per page per 15min interval and similarly with impressions. I know this wouldn't totally solve problem but it would make the PHBs happier.So the question is how do I group by date in a situation where I have both dates and times in a column without the horriable amount of overhead that the CONVERT function seems to be giving me? Any suggestions would be useful even if it is to totally start from scratch cause my table design is out of whack. I am using SQL Server 2000 for reference.Graham