Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
GregDDDD
Posting Yak Master
120 Posts |
Posted - 2012-06-27 : 15:08:18
|
I have two tables. One tracks people as the change categories. You are given the date they changed and the person_id. So in the example table, person_id was in category_1 from 7/1/2011 to 7/12/2011 and then on 7/13/2011 they changed to category_2. Then on 7/26/2011 they changed back to category_1 and remained there for the rest of the month. In the second table I have the same person_id with a list of activities in the month. I need to count the activities in the month for the person, grouping by category. The category is not stored with the activities. I’m not concerned with the type of activity.In the example, person_id 10002 would have 6 activities while they were in category_1 and 3 activities while they were in category_2. I’m unsure how to deal with the dates like this. If I had the start and end date on a row for the category I could do it. And no, this is not homework. Thanks, Gregdeclare @tbl_dates TABLE ( [person_id] [int] NOT NULL, [category_name] varchar(30) NOT NULL, [start_date] [datetime] NULL)declare @tbl_activities TABLE ( [activity_id] [int] NOT NULL, [person_id] [int] NOT NULL, [activity_date] [datetime] NULL)insert into @tbl_datesvalues (1002, 'category_1', '7/1/2011')insert into @tbl_datesvalues (1002, 'category_2', '7/13/2011')insert into @tbl_datesvalues (1002, 'category_1', '7/26/2011') insert into @tbl_activitiesvalues (1, 1002, '7/5/2011')insert into @tbl_activitiesvalues (23, 1002, '7/6/2011')insert into @tbl_activitiesvalues (23, 1002, '7/6/2011')insert into @tbl_activitiesvalues (2, 1002, '7/6/2011')insert into @tbl_activitiesvalues (2, 1002, '7/13/2011')insert into @tbl_activitiesvalues (8, 1002, '7/13/2011')insert into @tbl_activitiesvalues (2, 1002, '7/21/2011')insert into @tbl_activitiesvalues (5, 1002, '7/28/2011')insert into @tbl_activitiesvalues (8, 1002, '7/29/2011')select * from @tbl_datesselect * from @tbl_activities |
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2012-06-27 : 15:52:50
|
I think this'll work:SELECT activites.person_id, dates.category_name, COUNT(*) AS row_countFROM @tbl_activities AS activitesINNER JOIN ( SELECT A.person_id, A.category_name, a.start_date, COALESCE(end_date, DATEADD(MONTH, DATEDIFF(MONTH, 0, A.start_date) + 1, 0) - 1) end_date FROM @tbl_dates AS A OUTER APPLY ( SELECT MIN(start_date) AS end_date FROM @tbl_dates WHERE A.start_date < start_date ) AS B ) AS dates ON activites.activity_date >= dates.start_date and activites.activity_date < dates.end_dateGROUP BY activites.person_id, dates.category_name |
|
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2012-06-27 : 16:30:16
|
And a different approach. I think it should have fewer reads:---------------------------CTEs;with d1 as (select person_id , category_name , [start_date] , row_number() over (partition by person_id order by [start_date]) rnfrom @tbl_dates ),d2 as (select coalesce(a.person_id, b.person_id) person_id ,coalesce(a.[category_name], b.[category_name]) category_name ,coalesce(a.[start_date], convert(datetime, 0)) [start_date] ,coalesce(b.[start_date], convert(datetime, 99999)) [end_date]from d1 afull outer join d1 b on b.person_id = a.person_id and b.rn = a.rn + 1)---------------------------select out resultsselect d.person_id ,d.category_name ,count(a.activity_id) activity_countfrom d2 djoin @tbl_activities a on a.person_id = d.person_id and a.activity_date >= d.[start_date] and a.activity_date < d.[end_date]group by d.person_id ,d.category_nameOUTPUT:person_id category_name activity_count----------- ------------------------------ --------------1002 category_1 61002 category_2 3 Be One with the OptimizerTG |
|
|
GregDDDD
Posting Yak Master
120 Posts |
Posted - 2012-06-27 : 16:31:44
|
Man, I wish I knew this stuff better. I get the feeling I write a lot more code than I need to. I think you are very close, but it is difficult for me to debug. I'm not familiar with 'OUTER APPLY'. When I added another person_id and more activities for that person_id, the numbers didn't work. below is a new script with the additional values.Greginsert into @tbl_activitiesvalues (2, 1002, '7/13/2011')insert into @tbl_activitiesvalues (8, 1002, '7/13/2011')insert into @tbl_activitiesvalues (2, 1002, '7/21/2011') |
|
|
GregDDDD
Posting Yak Master
120 Posts |
Posted - 2012-06-27 : 16:39:46
|
Sorry, I must not have copied the entire script. Regardless TG's SQL seems to have worked. I added more values and it broke them out correctly. Very impressive. I will need to learn about 'coalesce'.Thank you,Greg |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2012-06-27 : 16:45:47
|
What do expect to see as output? Those new dats all fall between 2011-07-13 and 2011-07-26 so they'd get associated with category_2. Unless you want to chage the date prediate from >= and < to > and <= ?? |
|
|
GregDDDD
Posting Yak Master
120 Posts |
Posted - 2012-06-27 : 16:57:28
|
Sorry, those new dates were an incomplete script. Below is what I meant to post the second time.Gregdeclare @tbl_dates TABLE ( [person_id] [int] NOT NULL, [category_name] varchar(30) NOT NULL, [start_date] [datetime] NULL)declare @tbl_activities TABLE ( [activity_id] [int] NOT NULL, [person_id] [int] NOT NULL, [activity_date] [datetime] NULL)insert into @tbl_datesvalues (1002, 'category_1', '7/1/2011')insert into @tbl_datesvalues (1002, 'category_2', '7/13/2011')insert into @tbl_datesvalues (1002, 'category_1', '7/26/2011')insert into @tbl_datesvalues (1003, 'category_1', '7/1/2011') insert into @tbl_activitiesvalues (1, 1002, '7/5/2011')insert into @tbl_activitiesvalues (23, 1002, '7/6/2011')insert into @tbl_activitiesvalues (23, 1002, '7/6/2011')insert into @tbl_activitiesvalues (2, 1002, '7/6/2011')insert into @tbl_activitiesvalues (2, 1002, '7/13/2011')insert into @tbl_activitiesvalues (8, 1002, '7/13/2011')insert into @tbl_activitiesvalues (2, 1002, '7/21/2011')insert into @tbl_activitiesvalues (5, 1002, '7/28/2011')insert into @tbl_activitiesvalues (8, 1002, '7/29/2011')insert into @tbl_activitiesvalues (8, 1002, '7/29/2011')insert into @tbl_activitiesvalues (2, 1003, '7/13/2011')insert into @tbl_activitiesvalues (8, 1003, '7/13/2011')insert into @tbl_activitiesvalues (2, 1003, '7/21/2011')insert into @tbl_activitiesvalues (2, 1003, '7/31/2011') |
|
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2012-06-27 : 17:18:08
|
Based on the lastest data sample here are the results for both solutions - including some IO stats:TG:person_id category_name activity_count----------- ------------------------------ --------------1002 category_1 71003 category_1 41002 category_2 3Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, Table '#77CAB889'. Scan count 2, logical reads 2, physical reads 0, Table '#78BEDCC2'. Scan count 1, logical reads 1, physical reads 0, Lamprey:person_id category_name row_count----------- ------------------------------ -----------1002 category_1 111002 category_2 31003 category_2 3Table '#77CAB889'. Scan count 43, logical reads 56, physical reads 0Table '#78BEDCC2'. Scan count 1, logical reads 1, physical reads 0 Be One with the OptimizerTG |
|
|
|
|
|
|
|