Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
AskSQLTeam
Ask SQLTeam Question
0 Posts |
Posted - 2002-05-01 : 17:37:25
|
Hierarchies are sometimes difficult to store in SQL tables...things like trees, threaded forums, org charts and the like...and it's usually even harder to retrieve the hierarchy once you do store it. Here's a method that's easy to understand and maintain, and gives you the full hierarchy (or any piece of it) very quickly and easily. Article Link. |
|
Onamuji
Aged Yak Warrior
504 Posts |
Posted - 2002-05-01 : 18:02:27
|
it seems everytime i post something on the more fun topics rob always comes out with an article a few days later ;-) heh seems like it at least for the last 3 articles :-p he always gives a superior technique that makes me very excited and want to .... never mind ... good artcile because i use hierarchies all over the place ... |
|
|
rrb
SQLTeam Poet Laureate
1479 Posts |
Posted - 2002-05-01 : 20:15:53
|
Thanks Rob. I too found Joe's stuff a little daunting...I think I like this approach and can't wait for your next installment. Will you post a link here when you're done so we can subscribe "in advance"?Ta--I hope that when I die someone will say of me "That guy sure owed me a lot of money" |
|
|
AjarnMark
SQL Slashing Gunting Master
3246 Posts |
Posted - 2002-05-01 : 20:30:22
|
GREAT Article, Rob! I was intrigued by Joe Celko's technique, but I too was a bit daunted by it. Learned some good things from it, though.Just about to get started on an org chart type of project, and definitely like what you've got here. Looking forward to part 2, too! Haven't really done anything with XML. This should be fun! |
|
|
hello_ash
Starting Member
10 Posts |
Posted - 2002-05-02 : 00:34:52
|
Great article, I faced same problem but i solved using recursion but now i got very new idea. Tahnx Rob. |
|
|
Lavos
Posting Yak Master
200 Posts |
Posted - 2002-05-02 : 02:55:52
|
While this is a usable technique, it still falls down when you need to represent a graph instead of a tree.The classical example is a bill of materials where you have 4 different parts that use the same bolt (or share a common component. This is something that I've had to do a short while ago.), but you can still run into it with an org chart.For instance, a little over month ago I was put "on loan" to the IS department. I had one person who was nominally in charge of the programming work I was doing (but was too busy to really look at it, so I had free reign to design and implement), the IS manager, and my normal manager who I still reported to.Now that I'm back doing my "secretary" work, I still have one immediate supervisor and nominally report to the second shift superintendant also. (at the same time, when my immediate supervisor is gone, I'm in charge of the department, even though I have no authority. lol.)In those cases, you're still back to an iterative process, and storing the relationships in another table to maintain normalization. (I think that is also an adjacency model, but I could be wrong on the exact terminology.)Don't get me wrong, there are some nice aspects of this approach, but it's not something that I've been able to use in any of my projects. (I haven't dug into the nested set model for the same reason.)----------------------"O Theos mou! Echo ten labrida en te mou kephale!" |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-05-02 : 08:56:36
|
A-ha! Actually, you can create any number of independent trees with this structure, that's why I spun off the data into a Tree table. You could simply create a "Boss2" column, then run the lineage UPDATE for Boss2 instead of Boss/Boss1. Or, add a new node with the same EmployeeID, with the appropriate parent node and lineage. That's why splitting the employee info from the node info is a good idea; it lets you do things like this.I'll elaborate on this a little more in Part 2, but oh yeah, you can definitely have two bosses with this setup, you just have to do a little more work with it. |
|
|
Lavos
Posting Yak Master
200 Posts |
Posted - 2002-05-02 : 11:24:33
|
I noticed the independant tree feature, but things still break down for graphs.I'm going to move back to the bill of materials example.Part Foo is made up of a Bar and a Thinga. A Thinga is made up of a processed Majob.You have a part FooX, that's made up of the brand new Barzac, but also still uses the same Thinga.What happens when you re-engineer the Thinga, and make it out of a processed SomethingMaRether instead? You've got it in several trees so you have to alter each tree.Granted, that is still relatively easy to do, but my normalization instincts are popping in to tell me it's bad. (Unfortunately, I'm still too new at SQL to know that some denormalization is perfectly acceptable ;)----------------------"O Theos mou! Echo ten labrida en te mou kephale!" |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-05-02 : 11:53:38
|
I agree that the lineage method is probably not as ideal for nested sets/assemblies/BOM structures as Joe's nested set model is. I personally haven't made the mental leap that treats trees and nested sets/BOM as the same thing. The diagrams can be made to look the same, but in reality they do not function the same way. The bill of materials structure models something a little different from an org chart. Each part is part of a larger assembly, but that part cannot just be wantonly moved or removed from the assembly. You can't take a bolt in an automobile engine and promote it to be the parent assembly of the entire engine. On the other hand, a person in a corporate organization can be transferred, promoted or removed anywhere in the organization, and may or may not inherit subtrees of that organization. This was what I was concerned with when putting this method together. Since the hierarchy is truly arbitrary, it was intended to support any kind of node manipulation without regard to its current relationship to the hierarchy.Regarding the example you gave, a "Thinga" is actually a subtree, not a single node. It is a bit harder to manipulate subtrees using lineage vs. nested sets. If subtrees are more important than single nodes, I would recommend nested sets. If nodes are more important, lineage will probably work better for you. |
|
|
dsdeming
479 Posts |
Posted - 2002-05-02 : 14:22:37
|
This is an interesting method. I can't wait to see the next installment.I only have one problem: whenever I tried to enter the drummer's name into the table, the data just disappeared. It's like it spontaneously combusted or something. |
|
|
Lavos
Posting Yak Master
200 Posts |
Posted - 2002-05-02 : 18:18:33
|
Ahh, but the nodes and the sub-trees are equally important in a bill of materials (As an example, we track the coil steel that we buy, as well as the parts that we stamp out of the steel that later get welded/bolted/sold), and there is promotion and demotion. For instance, we could buy a component from a vendor (a node) and then decide that we can make that component ourselves, and so it would then be a subtree. (or vice versa.) Or, it's possible to re-engineer the production line so that instead of assembling components at another line, they are added to the finished part directly (and thus cutting out middle management? Or at least the intervening compl's)I think the BOM is conceptually identical to an organizational chart. If I just change the nouns and verbs, it would descibe the same thing.That said, I did just notice a feature of this approach.With the multiple independant trees, you don't _have_ to have the same children in each tree. (an example, a supervisor has two different managers. Some of the people he supervises should fall under one manager, while the rest would be under the other.) This has a distinct advantage (IMHO) over an adjacency model (Where you would have to store a "Department" code in the relationship table, and I've thought of a few other pieces of ugliness that this introduces.)So, my main question would be how to maintain the independant trees, when there _are_ interelations? (In the above example, one of the supervisor's employees falls underneath both manager's departments, or there is another manager who is over everyone.)----------------------"O Theos mou! Echo ten labrida en te mou kephale!" |
|
|
rrb
SQLTeam Poet Laureate
1479 Posts |
Posted - 2002-05-02 : 19:28:20
|
quote: I only have one problem: whenever I tried to enter the drummer's name into the table, the data just disappeared. It's like it spontaneously combusted or something.
That's because you didn't doinsert into tap (drummer)select name from gardeners where methodofdeath="bizarre accident"PS - it was cocaine...--I hope that when I die someone will say of me "That guy sure owed me a lot of money" |
|
|
jcelko
Esteemed SQL Purist
547 Posts |
Posted - 2002-05-06 : 13:44:23
|
>> However (you knew this was coming!) one of the issues I have with nested sets is the complexity required to do relatively simple tasks, like adding, deleting, or moving nodes in the tree. Even finding an employee's immediate supervisor or subordinates requires 3 self-joins AND a subquery! <<I havea book on nothing but trees in SQL in the works right now. over the years, people have found a bunch of neat tricks with the nested set model.To insert a new node, G1, under part G. We can insert one node at a time like this:BEGIN ATOMICDECLARE right_most_sibling INTEGER;SET right_most_sibling = (SELECT rgt FROM Frammis WHERE part = 'G');UPDATE Frammis SET lft = CASE WHEN lft > right_most_sibling THEN lft + 2 ELSE lft END, rgt = CASE WHEN rgt >= right_most_sibling THEN rgt + 2 ELSE rgt END WHERE rgt >= right_most_sibling; INSERT INTO Frammis (part, qty, wgt, lft, rgt) VALUES ('G1', 3, 4, parent, (parent + 1)); COMMIT WORK;END;The idea is to spread the lft and rgt numbers after the youngest child of the parent, G in this case, over by two to make room for the new addition, G1. This procedure will add the new node to the rightmost child position, which helps to preserve the idea of an age order among the siblings.To convert a nested sets model into an adjacency list model, which is the same as finding the immediate subodinates:SELECT B.emp AS boss, P.emp FROM OrgChart AS P LEFT OUTER JOIN OrgChart AS B ON B.lft = (SELECT MAX(lft) FROM OrgChart AS S WHERE P.lft > S.lft AND P.lft < S.rgt);--CELKO--Joe Celko, SQL Guru |
|
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2002-05-08 : 16:51:24
|
does anyone have an example of a good method for converting an adjacency model hierarchy table into a nested-set model table?create table #adjacency ( parent varchar(20) not null, child varchar(20) not null )insert #adjacencyselect 'mona','lisa' unionselect 'mona','joe' unionselect 'joe','timmy'create table #nestedset ( item varchar(20), lft int, rgt int )insert #nestedset (item,lft,rgt)select ????? I have an adjacency-model type table with several hundred thousand records and 12 levels of depth . . . a set-based conversion method would be best.<O> |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-05-08 : 16:58:02
|
In the article, there are several links to Joe's articles on nested sets, one of them has the code for doing the conversion. |
|
|
jcelko
Esteemed SQL Purist
547 Posts |
Posted - 2002-05-09 : 13:51:16
|
>>does anyone have an example of a good method for converting an adjacency model hierarchy table into a nested-set model table? ... I have an adjacency-model type table with several hundred thousand records and 12 levels of depth . . . a set-based conversion method would be best. <<I never found a pure set based method. To convert an adjacency list model into a nested set model, use a push down stack algorithm. Assume that we have these tables:-- Tree holds the adjacency modelCREATE TABLE Tree(emp CHAR(10) NOT NULL,boss CHAR(10));INSERT INTO TreeSELECT emp, boss FROM Personnel;-- Stack starts empty, will holds the nested set model CREATE TABLE Stack (stack_top INTEGER NOT NULL,emp CHAR(10) NOT NULL,lft INTEGER,rgt INTEGER);BEGIN ATOMIC DECLARE counter INTEGER;DECLARE max_counter INTEGER;DECLARE current_top INTEGER;SET counter = 2;SET max_counter = 2 * (SELECT COUNT(*) FROM Tree);SET current_top = 1;INSERT INTO Stack SELECT 1, emp, 1, NULL FROM TreeWHERE boss IS NULL;DELETE FROM TreeWHERE boss IS NULL;WHILE counter <= (max_counter - 2)LOOP IF EXISTS (SELECT * FROM Stack AS S1, Tree AS T1 WHERE S1.emp = T1.boss AND S1.stack_top = current_top) THEN BEGIN -- push when top has subordinates and set lft value INSERT INTO Stack SELECT (current_top + 1), MIN(T1.emp), counter, NULL FROM Stack AS S1, Tree AS T1 WHERE S1.emp = T1.boss AND S1.stack_top = current_top; DELETE FROM Tree WHERE emp = (SELECT emp FROM Stack WHERE stack_top = current_top + 1); SET counter = counter + 1; SET current_top = current_top + 1; END ELSE BEGIN -- pop the stack and set rgt value UPDATE Stack SET rgt = counter, stack_top = -stack_top -- pops the stack WHERE stack_top = current_top SET counter = counter + 1; SET current_top = current_top - 1; END IF;END LOOP;END;You wil need to translate my SQL/PSM into whatever target SQL 4GL you have, or put it into a host language program. --CELKO--Joe Celko, SQL Guru |
|
|
jcelko
Esteemed SQL Purist
547 Posts |
Posted - 2002-05-09 : 13:54:38
|
Arrgh! I copied the old file:INSERT INTO Stack SELECT 1, emp, 1, max_counter -- changed! FROM TreeWHERE boss IS NULL;DELETE FROM TreeWHERE boss IS NULL;WHILE counter < max_counter -- changed !LOOP ...--CELKO--Joe Celko, SQL Guru |
|
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2002-05-10 : 08:17:23
|
Sir Celko, I appreciate your response. I actually have been working with the push down stack model code that I adapted from another article of yours.I am having difficultiy with one aspect. What happens when your tree data looks like this...create table tree ( emp char (10) not null, boss char (10) )insert tree values ('Jerry',NULL)insert tree values ('Bert','Jerry')insert tree values ('Chuck','Jerry')insert tree values ('Donna','Chuck')insert tree values ('Eddie','Chuck')insert tree values ('Fred','Chuck')insert tree values ('Page47',NULL)insert tree values ('Sloan','Page47') ...Such that there are multiple trees in the forest, ie. multiple roots? I am assuming that I would want tree to look like....emp lft rgt ---------- ----------- ----------- Jerry 1 12Bert 2 3Chuck 4 11Donna 5 6Eddie 7 8Fred 9 10Page47 13 16Sloan 14 15 I am trying to work through the allpairs and leftmostpairs views and the TreeMerge procedure described here. Is this the best method for my situation or should I work towards adapting the stack algorithm for multiple roots?quote: The woods are lovely, dark and deep.But I have promises to keep,And miles to go before I sleep,And miles to go before I sleep.
Thanks...<O> |
|
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2002-05-10 : 16:09:41
|
I thought I might include my SQL Server translation of the adjacency to nested set conversion code as adapted from the Celko article. This seems to deal with a multi-root adjacency model pretty well....First the DDL . . .create table adjmodel ( emp char (10) not null, boss char (10) )gocreate table setmodel ( emp char(10), boss char(10), lft integer, rgt integer )gocreate view allpairsasselect t.boss as superior, t.emp as subordinatefrom setmodel twhere exists ( select 1 from setmodel where boss = t.emp) and t.boss <> t.empgocreate view leftmostpairsasselect distinct superior, (select min(subordinate) from allpairs where a.superior = superior) as subordinatefrom allpairs awhere superior = (select min(superior) from allpairs)gocreate index idx_adjmodel_boss on adjmodel(boss)create clustered index idx_adjmodel_emp on adjmodel(emp)create index idx_setmodel_empboss on setmodel(emp,boss)create index idx_setmodel_lftrgt on setmodel(lft,rgt)go ...here is some sample data...insert adjmodel values ('Jerry',NULL)insert adjmodel values ('Bert','Jerry')insert adjmodel values ('Chuck','Jerry')insert adjmodel values ('Page47',NULL)insert adjmodel values ('Donna','Chuck')insert adjmodel values ('Eddie','Chuck')insert adjmodel values ('Fred','Chuck')insert adjmodel values ('Sloan','Page47')go ...here is a proc needed...create proc adjmodelMerge @superior char(10), @subordinate char(10)asdeclare @size int, @insert_point intselect @size = 2 * ( select count(*) from setmodel where emp = @subordinate ), @insert_point = ( select min (lft) from setmodel where emp = @subordinate and boss = @superior ) - 1update setmodelset boss = case when boss = @subordinate then case when emp = @subordinate then null else @superior end else boss end, lft = case when (boss = @superior and lft > @size) then lft + @size else case when (boss = @subordinate and emp <> @subordinate) then lft + @insert_point else lft end end, rgt = case when (boss = @superior and rgt > @size) then rgt + @size + 2 else case when (boss = @subordinate and emp <> @subordinate) then rgt + @insert_point else rgt end endwhere boss in (@superior, @subordinate)delete setmodelwhere boss is null or emp is nullgo ...and finally the conversion dml...insert setmodelselect distinct t.boss, t.boss, 1, 2*(select count(*) + 1 from adjmodel where t.boss = boss)from adjmodel tinsert setmodelselect distinct t.emp, t.boss, 2*(select count(distinct emp) from adjmodel where emp <= t.emp and t.boss = boss), --in (emp,boss)), 2*(select count(distinct emp) from adjmodel where emp <= t.emp and t.boss = boss) + 1 --in (emp,boss)) + 1from adjmodel tdelete setmodelwhere boss is null or emp is nulldeclare @boss char(10), @emp char(10)while exists ( select 1 from leftmostpairs)begin select @boss = superior, @emp = subordinate from leftmostpairs exec adjmodelmerge @superior = @boss, @subordinate = @empend--and the proofselect * from setmodel order by boss,lft The boss column is the tree 'owner'; that is to say the emp at the root of that particular tree. The lft/rgt values refer to the node position in the boss's tree only.Now, I was not able to make the Page47 tree use the 13/16 lft/rgt values, but boss column seems to square that away.I would love input from any intereste party.(Rob, I like your model too and I don't mean to hi-jack this thread . . . hope you understand :) )<O> |
|
|
Dready
Starting Member
1 Post |
Posted - 2002-05-23 : 10:16:14
|
It's funny... I used your 'lineage' tip in my little web project: http://yesi.dread.ws/ . This is a MySQL webapp, but in an SQL approach I took the method you explain.The only thing I don't understand is why you keep a 'depth' field in your SQL Tree table: your 'lineage' field contains the way (in terms of node ids) to the root node, so the 'lineage' field already contains the 'depth' information.Anyway nice article ! |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2002-05-23 : 12:06:32
|
Thanks! It's a convenience really, you don't need it, but it takes up almost no space (you can make it a tinyint - 1 byte for up to 255 levels!) In case you need to know the depth, or count back or forward in levels, it's handy. You'd have to do something like this to calculate it from the lineage column:SELECT Len(Lineage)-Len(Replace(Lineage,'/','')) AS Depth FROM TreeDepending on your SQL product, the Len() and Replace() functions may not exist. |
|
|
Next Page
|
|
|
|
|