Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
LaurieCox
158 Posts |
Posted - 2014-07-22 : 15:39:21
|
So I have this table (table ddl and data inserts at end of post):Id ClientId StartDate EndDate----------- ----------- ---------- ----------1 208 2013-07-17 2014-07-162 208 2013-07-17 2014-07-163 21071 2014-05-22 2014-07-144 21071 2014-07-14 2015-07-135 21071 2014-04-03 2014-08-016 29116 2014-05-23 2015-05-227 29116 2014-06-12 2015-06-118 29116 2014-05-23 2015-05-229 62716 2014-06-23 2015-06-2210 62716 2014-06-08 2015-06-0711 62716 2014-05-23 2015-05-2212 90080 2014-05-21 2014-07-0213 90080 2014-07-03 2015-07-0214 90080 2014-05-21 2015-05-20 I want to find all pairs of records (for a given ClientId) where the date ranges overlap. I do not count it an overlap if Start Date of one record = End Date of another for the given ClientId.I have written this query:SELECT A.ClientId , A.Id as AId , B.Id as BId , A.StartDate as AStartDate , A.EndDate as AEndDate , B.StartDate as BStartDate , B.EndDate as BEndDate FROM TestDates a join TestDates b on a.ClientId = b.ClientId and a.Id <> b.Id and a.StartDate >= b.StartDate and a.StartDate < b.EndDate Which I think works. It finds the overlapping rows for all of the test cases I could think of at least. It gives me this result from the test data (row column added for reference):row ClientId ARowId BRowId AStartDate AEndDate BStartDate BEndDate---- ----------- ----------- ----------- ---------- ---------- ---------- ----------A 208 1 2 2013-07-17 2014-07-16 2013-07-17 2014-07-16B 208 2 1 2013-07-17 2014-07-16 2013-07-17 2014-07-16C 21071 3 5 2014-05-22 2014-07-14 2014-04-03 2014-08-01D 21071 4 5 2014-07-14 2015-07-13 2014-04-03 2014-08-01E 29116 6 8 2014-05-23 2015-05-22 2014-05-23 2015-05-22F 29116 7 6 2014-06-12 2015-06-11 2014-05-23 2015-05-22G 29116 7 8 2014-06-12 2015-06-11 2014-05-23 2015-05-22H 29116 8 6 2014-05-23 2015-05-22 2014-05-23 2015-05-22I 62716 9 10 2014-06-23 2015-06-22 2014-06-08 2015-06-07J 62716 9 11 2014-06-23 2015-06-22 2014-05-23 2015-05-22K 62716 10 11 2014-06-08 2015-06-07 2014-05-23 2015-05-22L 90080 12 14 2014-05-21 2014-07-02 2014-05-21 2015-05-20M 90080 13 14 2014-07-03 2015-07-02 2014-05-21 2015-05-20N 90080 14 12 2014-05-21 2015-05-20 2014-05-21 2014-07-02 The problem is on some of the overlapping rows it gives me both permutations. For example for ClientId 29116 row ClientId AId BId AStartDate AEndDate BStartDate BEndDate--- ----------- ----------- ----------- ---------- ---------- ---------- ----------E 29116 6 8 2014-05-23 2015-05-22 2014-05-23 2015-05-22F 29116 7 6 2014-06-12 2015-06-11 2014-05-23 2015-05-22G 29116 7 8 2014-06-12 2015-06-11 2014-05-23 2015-05-22H 29116 8 6 2014-05-23 2015-05-22 2014-05-23 2015-05-22 Row E and row H are really showing the same overlap of records 6 and 8 so I only want one of them in my result set. Note: I bolded the duplicates in the test data above. I only want one of each pair in my result set. I do not care which one.I did notice that the ones that give me both permutations the Start Dates are the same. I am not sure if this would always be the case. I first (stupidly) added "and a.StartDate <> b.StartDate" and then immediately realized that that got rid of both rows.So how to I eliminate the duplicated overlaps? And though my code worked on all the test cases I could think I am not 100% confident it finds all possible overlaps. Am I missing anything?Thanks,Lauriecreate table TestDates (Id int IDENTITY(1, 1) ,ClientId int ,StartDate Date ,EndDate Date) INSERT INTO TestDatesSelect 208, '2013-07-17', '2014-07-16' union allSelect 208, '2013-07-17', '2014-07-16' union allSelect 21071, '2014-05-22', '2014-07-14' union allSelect 21071, '2014-07-14', '2015-07-13' union allSelect 21071, '2014-04-03', '2014-08-01' union allSelect 29116, '2014-05-23', '2015-05-22' union allSelect 29116, '2014-06-12', '2015-06-11' union allSelect 29116, '2014-05-23', '2015-05-22' union allSelect 62716, '2014-06-23', '2015-06-22' union allSelect 62716, '2014-06-08', '2015-06-07' union allSelect 62716, '2014-05-23', '2015-05-22' union allSelect 90080, '2014-05-21', '2014-07-02' union allSelect 90080, '2014-07-03', '2015-07-02' union allSelect 90080, '2014-05-21', '2015-05-20' |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2014-07-22 : 16:33:36
|
[code]and a.StartDate <= b.EndDate and a.EndDate >= b.StartDate[/code] Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
|
LaurieCox
158 Posts |
Posted - 2014-07-23 : 09:58:57
|
Hi SwePeso,Thanks for your reply. It does not seem to do what I want. When I add those two statements to my existing query it gives me the same results (i.e. it does not remove one of the duplicate overlaps).Modified query:SELECT A.ClientId , A.Id as ARowId , B.Id as BRowId , A.StartDate as AStartDate , A.EndDate as AEndDate , B.StartDate as BStartDate , B.EndDate as BEndDate FROM TestDates a join TestDates b on a.ClientId = b.ClientId and a.Id <> b.Id and a.StartDate >= b.StartDate and a.StartDate < b.EndDate and a.StartDate <= b.EndDate and a.EndDate >= b.StartDate Results:ClientId ARowId BRowId AStartDate AEndDate BStartDate BEndDate----------- ----------- ----------- ---------- ---------- ---------- ----------208 2 1 2013-07-17 2014-07-16 2013-07-17 2014-07-16208 1 2 2013-07-17 2014-07-16 2013-07-17 2014-07-1621071 3 5 2014-05-22 2014-07-14 2014-04-03 2014-08-0121071 4 5 2014-07-14 2015-07-13 2014-04-03 2014-08-0129116 7 6 2014-06-12 2015-06-11 2014-05-23 2015-05-2229116 8 6 2014-05-23 2015-05-22 2014-05-23 2015-05-2229116 6 8 2014-05-23 2015-05-22 2014-05-23 2015-05-2229116 7 8 2014-06-12 2015-06-11 2014-05-23 2015-05-2262716 9 10 2014-06-23 2015-06-22 2014-06-08 2015-06-0762716 9 11 2014-06-23 2015-06-22 2014-05-23 2015-05-2262716 10 11 2014-06-08 2015-06-07 2014-05-23 2015-05-2290080 14 12 2014-05-21 2015-05-20 2014-05-21 2014-07-0290080 12 14 2014-05-21 2014-07-02 2014-05-21 2015-05-2090080 13 14 2014-07-03 2015-07-02 2014-05-21 2015-05-20 The rows spit out in a slightly different order but as you can see it still includes the duplicate (row permutation) overlaps. For example for ClientId 29116 it still includes overlap rows showing 8 overlaps 6 and 6 overlaps 8 (bolded above). As these are functionally the same overlap I only need one in the result set. I do not care which one but I only want one.Unfortunately I have been juggling some stuff and well not have time this morning to read the thread in the link in thoroughly but I did glance at it and noted that one of posts showed only using the clause you suggested so I ran this query:SELECT A.ClientId , A.Id as ARowId , B.Id as BRowId , A.StartDate as AStartDate , A.EndDate as AEndDate , B.StartDate as BStartDate , B.EndDate as BEndDate FROM TestDates a join TestDates b on a.ClientId = b.ClientId and a.Id <> b.Id --and a.StartDate >= b.StartDate --and a.StartDate < b.EndDate and a.StartDate <= b.EndDate and a.EndDate >= b.StartDate And got this result:ClientId ARowId BRowId AStartDate AEndDate BStartDate BEndDate----------- ----------- ----------- ---------- ---------- ---------- ----------208 2 1 2013-07-17 2014-07-16 2013-07-17 2014-07-16208 1 2 2013-07-17 2014-07-16 2013-07-17 2014-07-1621071 4 3 2014-07-14 2015-07-13 2014-05-22 2014-07-1421071 5 3 2014-04-03 2014-08-01 2014-05-22 2014-07-1421071 3 4 2014-05-22 2014-07-14 2014-07-14 2015-07-1321071 5 4 2014-04-03 2014-08-01 2014-07-14 2015-07-1321071 3 5 2014-05-22 2014-07-14 2014-04-03 2014-08-0121071 4 5 2014-07-14 2015-07-13 2014-04-03 2014-08-0129116 7 6 2014-06-12 2015-06-11 2014-05-23 2015-05-2229116 8 6 2014-05-23 2015-05-22 2014-05-23 2015-05-2229116 6 7 2014-05-23 2015-05-22 2014-06-12 2015-06-1129116 8 7 2014-05-23 2015-05-22 2014-06-12 2015-06-1129116 6 8 2014-05-23 2015-05-22 2014-05-23 2015-05-2229116 7 8 2014-06-12 2015-06-11 2014-05-23 2015-05-2262716 10 9 2014-06-08 2015-06-07 2014-06-23 2015-06-2262716 11 9 2014-05-23 2015-05-22 2014-06-23 2015-06-2262716 9 10 2014-06-23 2015-06-22 2014-06-08 2015-06-0762716 11 10 2014-05-23 2015-05-22 2014-06-08 2015-06-0762716 9 11 2014-06-23 2015-06-22 2014-05-23 2015-05-2262716 10 11 2014-06-08 2015-06-07 2014-05-23 2015-05-2290080 14 12 2014-05-21 2015-05-20 2014-05-21 2014-07-0290080 14 13 2014-05-21 2015-05-20 2014-07-03 2015-07-0290080 12 14 2014-05-21 2014-07-02 2014-05-21 2015-05-2090080 13 14 2014-07-03 2015-07-02 2014-05-21 2015-05-20 Which is actually adding more permutations for the matches then the original query had.Anyway when I have the time I will look more closely at the script in the other thread.Laurie |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2014-07-23 : 10:03:24
|
I've put the date filter on the WHERE clause for brevitySELECT A.ClientId, A.Id as AId, B.Id as BId, A.StartDate as AStartDate, A.EndDate as AEndDate, B.StartDate as BStartDate, B.EndDate as BEndDateFROM dbo.TestDates AS aINNER JOIN dbo.TestDates AS b ON b.ClientId = a.ClientId AND b.Id > b.IdWHERE a.StartDate <= b.EndDate AND a.EndDate >= b.StartDate; Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA |
|
|
LaurieCox
158 Posts |
Posted - 2014-07-23 : 12:36:29
|
quote: Originally posted by SwePeso I've put the date filter on the WHERE clause for brevitySELECT A.ClientId, A.Id as AId, B.Id as BId, A.StartDate as AStartDate, A.EndDate as AEndDate, B.StartDate as BStartDate, B.EndDate as BEndDateFROM dbo.TestDates AS aINNER JOIN dbo.TestDates AS b ON b.ClientId = a.ClientId AND b.Id > b.IdWHERE a.StartDate <= b.EndDate AND a.EndDate >= b.StartDate; Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA
So I ran your query and got zero rows. But I really had to work on something else so I figured out I would come back to it later.Then I realized I really needed to it to complete what I was doing and figured I would work on it. So I replaced my two 'on' statements with your where clause and ran it against my real data. I got a bunch of rows.So I looked more closely at your query and saw that you had changed my a.Id <> b.id to b.Id > b.Id and a light bulb went off.So I corrected your typo to a.Id > b.Id and it worked. I really should have tried that before I even created this thread because the solution is obvious. I was comparing a to b and b to a. Of course I am going to get a row for each comparison.Thank you very much,Laurie |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2014-07-23 : 14:52:51
|
Great you got it working. Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA |
|
|
LaurieCox
158 Posts |
Posted - 2014-08-01 : 09:53:13
|
It turns out there are open ended ranges (i.e. EndDate is null). So I modified the query to this:SELECT A.ClientId, A.Id as AId, B.Id as BId, A.StartDate as AStartDate, A.EndDate as AEndDate, B.StartDate as BStartDate, B.EndDate as BEndDateFROM dbo.TestDates AS aINNER JOIN dbo.TestDates AS b ON b.ClientId = a.ClientId AND a.Id > b.IdWHERE (a.StartDate <= b.EndDate AND a.EndDate >= b.StartDate) or (a.EndDate is null and b.EndDate is null) or (a.EndDate is null and a.StartDate <= b.EndDate) or (b.EndDate is null and b.StartDate <= a.EndDate) I ran it against some test cases (see end of post for data) and it seemed to work.Questions:I have I missed any test cases?(because I always want to learn) Is there a better way to check for open ended ranges?Thanks,LaurieTest data:INSERT INTO TestDatesselect 100, '2014-01-01', null union all --overlapselect 100, '2015-01-01', null union allselect 200, '2014-02-05', null union all --overlapselect 200, '2013-01-01','2014-03-01' union allselect 300, '2014-02-05', null union all --no overlapselect 300, '2013-01-01', '2013-03-01' union allselect 400, '2014-01-01', '2015-01-01' union all --overlapselect 400, '2013-01-01', null union allselect 500, '2014-01-01', '2015-01-01' union all --no overlapselect 500, '2015-01-02', null |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2014-08-01 : 11:54:00
|
[code]SELECT A.ClientId, A.Id as AId, B.Id as BId, A.StartDate as AStartDate, A.EndDate as AEndDate, B.StartDate as BStartDate, B.EndDate as BEndDateFROM dbo.TestDates AS aINNER JOIN dbo.TestDates AS b ON b.ClientId = a.ClientId AND b.Id > a.IdWHERE a.StartDate <= ISNULL(b.EndDate, '99991231') AND ISNULL(a.EndDate, '99991231') >= b.StartDate;[/code] Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA |
|
|
LaurieCox
158 Posts |
Posted - 2014-08-01 : 13:00:50
|
Hi SwePeso,That makes sense. Should have thought of it myself.Thanks,Laurie |
|
|
|
|
|
|
|