Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Big_R
Starting Member
7 Posts |
Posted - 2014-09-25 : 15:00:19
|
My company has a lot of data that comes from a non-relational mainframe database written 30 or 40 years ago. We manufacture many different parts that have components (for example, hose assembly A is composed of fittings 1 and 2, and hose 3).My problem is that we have multiple locations, and sometimes the list of components is duplicated or partially duplicated for different locations. I would like to write a query that will choose components and give preference to the location of the top level assembly, but if a unique component only has a location different from the assembly, then go ahead and choose that. For example, if I pull the Bill of Materials for Hose assembly A, I get the following:Level 0 is always the top level Assembly.Assembly Component Location LevelA A 2x 0A 1 2x 1A 2 2x 1A 2 2r 1A 3 2n 1A 3 2r 1The result set I want to get out of the table is the following:Assembly Component Location LevelA A 2x 0A 1 2x 1A 2 2x 1A 3 2n 1 (or 2r, it doesn't matter).Has anyone ever had to do this? I'm thinking two CTE's, one containing all the parts that have the same location as the Assembly, and another one with only parts that do not have a record with the same location as the Assembly record. I will be trying that solution while I'm waiting on a reply to this thread.Big_R |
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2014-09-25 : 19:20:07
|
[code]declare @tbl table ( Assembly varchar(10) not null, Component varchar(10) not null, Location varchar(10) not null, Level int not null )insert into @tbl (Assembly, Component, Location, Level)values ('A', 'A', '2x', 0), ('A', '1', '2x', 1), ('A', '2', '2x', 1), ('A', '2', '2r', 1), ('A', '3', '2n', 1), ('A', '3', '2r', 1);with Hierarchyas (select p.Assembly, p.Component, p.Location, p.Level, cast(Null as varchar(10)) ParentLocationfrom @tbl pwhere p.Level = 0union allselect c.Assembly, c.Component, c.Location, c.Level, p.Locationfrom Hierarchy pinner join @tbl c on p.Assembly = c.Assembly and p.Level = c.Level - 1),OrderedHierarchy as (select h.*, row_number() over (partition by Assembly, Component order by case when h.Location = h.ParentLocation then 1 else 2 end, h.Location) rnfrom Hierarchy h)select Assembly, Component, Location, Levelfrom OrderedHierarchywhere rn = 1order by Assembly, Level, Component[/code]Et voila! Too often we enjoy the comfort of opinion without the discomfort of thought. - John F. Kennedy |
|
|
Big_R
Starting Member
7 Posts |
Posted - 2014-09-26 : 11:15:29
|
That worked pretty good for the first level, but I ran into an issue with the row_number() function when I added a second level. The order by h.Location doesn't work then because the locations aren't necessarily in alphabetical order. Try it with this data set:insert into @tbl (Assembly, Component, Location, Level)values ('A', 'A', '2x', 0), ('A', '1', '2x', 1), ('A', '2', '2x', 1), ('A', '2', '2r', 1), ('A', '3', '2n', 1), ('A', '3', '2r', 1), ('A', '4', '2x', 2), ('A', '4', '2r', 2), ('A', '4', '2n', 2)What I didn't tell you is that some assemblies have over 500 components and may go 6 levels deep. That was my bad. However, I hadn't thought to use the row_number() function to deal with any duplicates in different locations. Thanks for that insight.Here's what I've come up with so far:DECLARE @tbl TABLE([Assembly] VARCHAR(10)NOT NULL , [Component] VARCHAR(10)NOT NULL , [Location] VARCHAR(10)NOT NULL , [Level] INT NOT NULL);INSERT INTO @tbl([Assembly] , [Component] , [Location] , [Level])VALUES('A' , 'A' , '2x' , 0) , ('A' , '1' , '2x' , 1) , ('A' , '2' , '2x' , 1) , ('A' , '2' , '2r' , 1) , ('A' , '3' , '2n' , 1) , ('A' , '3' , '2r' , 1) , ('A' , '4' , '2x' , 2) , ('A' , '4' , '2r' , 2) , ('A' , '4' , '2n' , 2);WITH TopLevelAS ( SELECT [Assembly] , [Component] , [Location] , [Level] FROM @tbl WHERE [Level] = 0 ),SameLocAS ( SELECT tb.[Assembly] , tb.[Component] , tb.[Location] , tb.[Level] FROM @tbl tb JOIN TopLevel TL ON tb.[Assembly] = TL.[Assembly] AND tb.[Location] = TL.[Location] ),DiffLocAS ( SELECT tb.[Assembly] , tb.[Component] , tb.[Location] , tb.[Level] , ROW_NUMBER() OVER (PARTITION BY tb.[Assembly],tb.[Component],tb.[Level] ORDER BY tb.[Location]) AS [Rank] FROM @tbl tb LEFT OUTER JOIN SameLoc SL ON tb.[Assembly] = SL.[Assembly] AND tb.[Component] = SL.[Component] AND tb.[Level] = SL.[Level] WHERE SL.[Assembly] IS NULL )SELECT [Assembly] , [Component] , [Location] , [Level] FROM SameLocUNION SELECT [Assembly] , [Component] , [Location] , [Level] FROM DiffLocWHERE [Rank] = 1ORDER BY [Level] What do you think?Big_R |
|
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2014-09-26 : 11:56:46
|
My logic was always picking the Component relative to its immediate parent assembly's location; it should be using the root location (I think). I have tweaked the code a bit. See if this gets the results expected:declare @tbl table ( Assembly varchar(10) not null, Component varchar(10) not null, Location varchar(10) not null, Level int not null )insert into @tbl (Assembly, Component, Location, Level)values ('A' , 'A' , '2x' , 0) , ('A' , '1' , '2x' , 1) , ('A' , '2' , '2x' , 1) , ('A' , '2' , '2r' , 1) , ('A' , '3' , '2n' , 1) , ('A' , '3' , '2r' , 1) , ('A' , '4' , '2x' , 2) , ('A' , '4' , '2r' , 2) , ('A' , '4' , '2n' , 2); --('A', 'A', '2x', 0), --('A', '1', '2x', 1), --('A', '2', '2x', 1), --('A', '2', '2r', 1), --('A', '3', '2n', 1), --('A', '3', '2r', 1);with Hierarchyas (select p.Assembly, p.Component, p.Location, p.Level, p.Location HomeLocationfrom @tbl pwhere p.Level = 0union allselect c.Assembly, c.Component, c.Location, c.Level, p.HomeLocationfrom Hierarchy pinner join @tbl c on p.Assembly = c.Assembly and p.Level = c.Level - 1),OrderedHierarchy as (select h.*, row_number() over (partition by Assembly, Component, Level -- Not sure if this is needed order by case when h.Location = h.HomeLocation then 1 else 2 end, h.Location) rnfrom Hierarchy h)select Assembly, Component, Location, Levelfrom OrderedHierarchywhere rn = 1order by Assembly, Level, Component On the other hand I might be completely out to lunch... You decide. Too often we enjoy the comfort of opinion without the discomfort of thought. - John F. Kennedy |
|
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2014-09-26 : 11:59:26
|
Stray thought: You could probably save the "Home Location" once in a variable by selecting from the root Component instead of pulling it through each successive level. It might make the script incrementally more performant. Too often we enjoy the comfort of opinion without the discomfort of thought. - John F. Kennedy |
|
|
Big_R
Starting Member
7 Posts |
Posted - 2014-09-26 : 15:30:18
|
I like the logic of your solution, the code is a little more elegant than my attempt.Unfortunately, when I tried your method against the real table, it ran into some performance issues. If you take your method and simply select * from hierarchy, the 9 rows in the original data turn into 21 rows in hierarchy, with 15 of those rows having to do with component 4. It looks like there might be some sort of factorial progression going on depending on the depth of the levels.I usede your query on the real table, using an assembly that has 529 parts and 3 levels, and let it run for two hours before I killed it. My script runs in less than a second.Big_R |
|
|
Big_R
Starting Member
7 Posts |
Posted - 2014-09-26 : 16:03:54
|
Just had a big 'd'oh!' moment. Turns out the table is sitting in the development database as a heap - no indexes on it at all. So I added an index and your script ran in a little over two minutes, and returned 62,129 rows for the hierarchy cte. I think your bottleneck might be on the recursive part of your cte:from Hierarchy pinner join @tbl c on p.Assembly = c.Assembly and p.Level = c.Level - 1 You're not accounting for the components in the join and you're winding up with a partial Cartesian product.Big_R |
|
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2014-09-29 : 16:27:08
|
quote: I think your bottleneck might be on the recursive part of your cte:
My bottleneck???You could probably add some additional logic to prune the cte but I've taken this as far as I'd care to. I hope that you found this helpful and good luck. Too often we enjoy the comfort of opinion without the discomfort of thought. - John F. Kennedy |
|
|
Big_R
Starting Member
7 Posts |
Posted - 2014-09-30 : 09:09:09
|
Hey, no offense intended! I'm just reporting on what I found. Anyway, I wanted to thank you because your ideas have really helped me get over my roadblocks.Big_R |
|
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2014-09-30 : 13:26:11
|
No worries, mate. Glad to be of service. Too often we enjoy the comfort of opinion without the discomfort of thought. - John F. Kennedy |
|
|
|
|
|
|
|