SQL for Threaded Discussion (Part II)

By Bill Graziano on 7 December 2000 | Tags: Application Design


webguru22 writes "Hi SQL gurus, Here's a twist on the old "threaded message board query" question: I need to come up with a data model that will let me have elements in an "outline" with numbered sub-elements." Actually a paraphrased his question quite a bit. You can see the original question in the forums though.

This is the first question I've pulled out of the forums to answer. I've been wanting to write this article for quite a while now and webguru22 just gave me a good reason. Earlier I had an article on a table structure and code for threaded discussions. My goal was to make the queries as simple as possible. This article aims for the same result but keeps your table as simple as possible. It also allows us to do some pretty neat things with recursive stored procedures. My examples try to cover both posts and the hierarchial numbering scheme from the question. It should work for any generic "expanding hierarchy" problem.

We'll start with a table that looks like this:

CREATE TABLE [Posts] (
	[PostID] [int] IDENTITY (1, 1) NOT NULL ,
	[Subject] [char] (50) NULL ,
	[ParentID] [int] NULL ,
	[PostSortKey] [datetime] NOT NULL ) 
GO

ALTER TABLE [Posts] WITH NOCHECK ADD 
	CONSTRAINT [DF_Posts_SortKey] DEFAULT (getdate()) FOR [PostSortKey]
GO

The PostID column is our primary key for this table. Subject is the subject of the post. For simplicity sake I removed all other post related fields that were not absolutely necessary such as author and the body of the post. ParentID is the PostID of the post for the parent. If this post is a top level post, then this field will be zero. PostSortKey is the sort order for the posts. This will usually be a datetimefield. In my example here, I used a default of GETDATE() to populate the field. In the specific example of this question it could be whatever the user wanted.

My example will use a dataset that looks like this:

PostID      Subject                                            ParentID    PostSortKey
----------- -------------------------------------------------- ----------- -----------------------
1           First Post                                         0           2000-12-06 20:54:29.407
2           Second Post                                        0           2000-12-06 20:54:29.407
3           Child of First Post                                1           2000-12-06 20:54:29.407
4           Child of Second Post                               2           2000-12-06 20:54:29.407
5           Child of First Child                               3           2000-12-06 20:54:29.407
6           Another FP Child                                   1           2000-12-06 21:04:49.217
7           Smallest Kid                                       5           2000-12-06 21:18:49.203
8           Another Munchkin                                   3           2000-12-06 21:28:22.040
Populating a table like this should be very easy. You just need to insert a record with it's parent ID.

This example is composed of two pieces of code. The first is a SQL Script to create a temporary table and call the stored procedure. It looks like this:

Create Table #NestedPosts (
SortID int IDENTITY (1,1),
PostID int,
PostKey varchar(200),
PostLevel int )
exec getchildren 0, 1, ''
SELECT SortID, 
  P.PostID, 
  ParentID, 
  PostLevel, 
  PostKey = LEFT(PostKey, 10),
  PostSortKey = convert(varchar(19), PostSortKey, 120), 
  Subject = LEFT( SPACE( (PostLevel-1) * 2 ) + Subject , 40)
FROM #NestedPosts N JOIN Posts P
ON N.PostID = P.PostID
Order by SortID
DROP TABLE #NestedPosts
The temporary table is populated by the stored procedure. The final SELECT prints out the results. Next is the stored procedure. It looks like this:
CREATE PROC GetChildren 
(@ParentID int, @PostLevel int, @ParentPostKey varchar(200) ) AS

SET NOCOUNT ON

DECLARE @NextLevel int, @Counter int, @PostKey varchar(200)

SET @Counter = 1

-- Build a cursor to loop through all the kids of this post
DECLARE c1 CURSOR LOCAL FOR
SELECT PostID 
FROM Posts
WHERE ParentID = @ParentID
Order by PostSortKey ASC

OPEN c1

FETCH NEXT FROM c1
INTO @ParentID

WHILE @@FETCH_STATUS = 0
BEGIN

  -- Build a key up as we go
  IF @PostLevel = 1
    SET @PostKey = convert(varchar, @Counter)
  ELSE 
    SET @PostKey = @ParentPostKey + '.' + convert(varchar, @Counter)

  -- Put this record in the temp table
  INSERT #NestedPosts (PostID, PostKey, PostLevel) VALUES (@ParentID, @PostKey, @PostLevel)

  SET @NextLevel = @PostLevel + 1
  
  -- Process all the children for this post
  EXEC GetChildren @ParentID, @NextLevel, @PostKey
 
  SET @Counter = @Counter + 1
 
  -- And get the next record at this level
  FETCH NEXT FROM c1
  INTO @ParentID 
END

CLOSE c1
DEALLOCATE c1

SET NOCOUNT OFF

This stored procedure runs through all the child posts for a given parent post and puts them into the temp table. It also builds up the hierarchical numbering scheme that was asked about in the question. The really tricky part is that it calls itself in order to handle any child records of the current record it's processing. You can have up to 32 levels of nesting in your stored procedure. That means this little sample is limited to 32 levels of posts. I also could have used a variable called @@NESTLEVEL which SQL Server uses to track how far down a recursion tree you are. I built and manually maintained a variable called @PostLevel for the task.

The output from the SELECT statement above looks like this:

SortID      PostID      ParentID    PostLevel   PostKey    PostSortKey         Subject
----------- ----------- ----------- ----------- ---------- ------------------- ------------------------- 
1           1           0           1           1          2000-12-06 20:54:29 First Post
2           3           1           2           1.1        2000-12-06 20:54:29   Child of First Post

3           5           3           3           1.1.1      2000-12-06 20:54:29     Child of First Child
4           7           5           4           1.1.1.1    2000-12-06 21:18:49       Smallest Kid
5           8           3           3           1.1.2      2000-12-06 21:28:22     Another Munchkin
6           6           1           2           1.2        2000-12-06 21:04:49   Another FP Child
7           2           0           1           2          2000-12-06 20:54:29 Second Post
8           4           2           2           2.1        2000-12-06 20:54:29   Child of Second Post

That's about all there is to this. Try it and let me know what you think.

-graz


Related Articles

Application Locks (or Mutexes) in SQL Server 2005 (7 January 2008)

What I Wish Developers Knew About SQL Server (Presentation) (11 October 2007)

Multiple Active Result Sets (MARS) – Transactions and Debugging (17 June 2007)

Multiple Active Result Sets (MARS) (3 April 2007)

How SQL Server 2005 Enables Service-Oriented Database Architectures (8 September 2006)

Presentation: What I Wish Developers Knew About SQL Server (17 November 2005)

GeoCoding with SQL Server and Google (8 August 2005)

How to Asynchronously Execute a DTS package from ASP or ASP.NET (27 March 2005)

Other Recent Forum Posts

Query performance Call Center data (21h)

Looking for on Premises tool to read data from SQL logs and populate data warehouse (1d)

Possible SQL 2014 to 2016 Issue - Some Application Functions Slow/Failing (1d)

Working with multiple WHERE statements (1d)

Create a new field value that shows a difference in value from 2 fields (3d)

Hierarchy wise Sales Targets (3d)

Get the MaxDate in results (5d)

Switch Statement within a SSRS SQL Expression (6d)

- Advertisement -