Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Create ID that links records based on criteria

Author  Topic 

lines_michael
Starting Member

1 Post

Posted - 2014-08-06 : 23:40:22
I have a table of data with 10 fields and around 8,000 rows. I was asked to analyze the data in three different ways.

1. Identify when all 9 fields match identically (this excludes the ID field, as it is the PK).

Obviously finding the records wasn't a problem. I used a self join and row_number()over(partition by [10 fields] order by ...) to create a field that had matching numbers for each matching record.

2. Identify any time the first 3 fields (past the PK field) match and the StartDate and EndDate overlap at all.

I was able to identify these without any problem but am struggling with how to create a "linking" number like I got on the identical matches above. I can't partition because of the overlapping date requirement.

3. Similar to above. First 3 match (after PK), dates overlap, AND either Quantity matches or Quantity2 matches.

I'm needing to the same "grouping ID" to these records as well.

I did my best to provide some sample data but it's looking pretty ugly in this editor. Hopefully it is usable. Let me know if it isn't.

Here is a sample data set to work with.


CREATE TABLE #TEST (
ID INT NOT NULL PRIMARY KEY,
CustomerID BIGINT NOT NULL,
Code VARCHAR(7) NULL,
Warehouse VARCHAR(15) NULL,
authDateFrom DATETIME NULL,
authDateTo DATETIME NULL,
Quantity INT NULL,
Quantity2 INT NULL,
BeginTime DATETIME NULL,
EndTime DATETIME NULL
)

INSERT INTO #TEST VALUES('469','712998112','S5125','01321383','2014-08-31 00:00:00.000','2014-12-29 00:00:00.000','','','NULL','NULL')
INSERT INTO #TEST VALUES('476','712998112','S5125','01318996','2014-08-31 00:00:00.000','2014-12-29 00:00:00.000','','','NULL','NULL')
INSERT INTO #TEST VALUES('629','713294655','T1019','10076362','2014-04-29 00:00:00.000','2014-08-27 00:00:00.000','8','10','14:00:00.0000000','16:00:00.0000000')
INSERT INTO #TEST VALUES('631','713294655','T1019','10076362','2014-08-28 00:00:00.000','2014-12-26 00:00:00.000','8','12','14:00:00.0000000','16:00:00.0000000')
INSERT INTO #TEST VALUES('632','713294655','T1019','10076362','2014-08-28 00:00:00.000','2014-12-26 00:00:00.000','16','10','14:00:00.0000000','16:00:00.0000000')
INSERT INTO #TEST VALUES('633','713294655','T1019','10076362','2014-04-29 00:00:00.000','2014-08-27 00:00:00.000','16','12','14:00:00.0000000','16:00:00.0000000')
INSERT INTO #TEST VALUES('1024','718995634','T1019','01432996','2014-06-26 00:00:00.000','2014-07-12 00:00:00.000','','','NULL','NULL')
INSERT INTO #TEST VALUES('1025','718995634','T1019','01432996','2014-06-26 00:00:00.000','2014-07-12 00:00:00.000','','','NULL','NULL')
INSERT INTO #TEST VALUES('1683','713891849','T1019','01329004','2014-05-01 00:00:00.000','2014-08-29 00:00:00.000','12','','14:00:00.0000000','14:00:00.0000000')
INSERT INTO #TEST VALUES('1684','713891849','S5125','01329004','2014-05-01 00:00:00.000','2014-08-29 00:00:00.000','12','','14:00:00.0000000','14:00:00.0000000')
INSERT INTO #TEST VALUES('1687','713891849','T1019','01329004','2014-08-30 00:00:00.000','2014-12-28 00:00:00.000','12','','14:00:00.0000000','14:00:00.0000000')
INSERT INTO #TEST VALUES('1688','713891849','S5125','01329004','2014-08-30 00:00:00.000','2014-12-28 00:00:00.000','12','','14:00:00.0000000','14:00:00.0000000')
INSERT INTO #TEST VALUES('1946','716422348','S5125','01331675','2014-06-23 00:00:00.000','2014-10-21 00:00:00.000','20','','15:00:00.0000000','15:00:00.0000000')
INSERT INTO #TEST VALUES('1949','716422348','S5125','01331675','2014-06-23 00:00:00.000','2014-10-21 00:00:00.000','','','NULL','NULL')
INSERT INTO #TEST VALUES('1951','716422348','S5125','01393067','2014-06-23 00:00:00.000','2014-10-21 00:00:00.000','20','','15:00:00.0000000','15:00:00.0000000')
INSERT INTO #TEST VALUES('1952','716422348','S5125','01393067','2014-06-23 00:00:00.000','2014-10-21 00:00:00.000','','','NULL','NULL')
INSERT INTO #TEST VALUES('2344','712249329','T1019','01364731','2014-06-03 00:00:00.000','2014-10-01 00:00:00.000','','','NULL','NULL')
INSERT INTO #TEST VALUES('2345','712249329','T1019','01364731','2014-06-03 00:00:00.000','2014-10-01 00:00:00.000','','','NULL','NULL')
INSERT INTO #TEST VALUES('2626','714443152','S5150','01359799','2014-07-22 00:00:00.000','2014-07-22 00:00:00.000','','','NULL','NULL')
INSERT INTO #TEST VALUES('2627','714443152','S5125','01359799','2014-07-22 00:00:00.000','2014-07-22 00:00:00.000','','','NULL','NULL')
INSERT INTO #TEST VALUES('3483','714661120','T1019','01661876','2014-06-29 00:00:00.000','2014-10-27 00:00:00.000','14','','06:30:00.0000000','16:00:00.0000000')
INSERT INTO #TEST VALUES('3484','714661120','T1019','01661876','2014-06-29 00:00:00.000','2014-10-27 00:00:00.000','8','','06:30:00.0000000','16:00:00.0000000')

ScottPletcher
Aged Yak Warrior

550 Posts

Posted - 2014-08-11 : 13:21:26
Try this query for 2 and 3:


SELECT
t1.ID,
t2.ID,
t1.CustomerID,
t1.Code,
t1.Warehouse,
CASE WHEN t1.Quantity = t2.Quantity OR t1.Quantity2 = t2.Quantity THEN 1 ELSE 0 END AS Is_Quantity_Match
FROM #TEST t1
INNER JOIN #TEST t2 ON
t2.ID < t1.ID AND
t2.CustomerID = t1.CustomerID AND
t2.Code = t1.Code AND
t2.Warehouse = t1.Warehouse AND
t2.authDateFrom <= t1.authDateTo AND
t2.authDateTo >= t1.authDateFrom

Go to Top of Page
   

- Advertisement -