Server Side Paging using SQL Server 2005

By Bill Graziano on 4 January 2007 | Tags: SELECT


A common activity in applications is to page results or record sets from a database. This is usually done on the client using the client's paging functionality or on the server through a variety of methods. In SQL Server 2000 those server side methods typically used dynamic SQL or nested TOP clauses and weren't very efficient. Using Common Table Expressions in SQL Server 2005 we have a better way to page record sets on the server.

The Members Page on SQLTeam.com lists the first 20 members sorted by number of posts. In order to generate this it selects the entire set of members, returns it to the client and then displays the first 20 rows using ADO's paging functionality. The SELECT statement it runs looks something like this:

SELECT	M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, M_DATE, M_COUNTRY
FROM	dbo.FORUM_MEMBERS
ORDER BY M_POSTS DESC;

It has a cost of 3.2 and does approximately 1,999 logical page reads to return the 25,000 registered members in our forum. It does have the benefit of being very simple to write.

Converting this query to a Common Table Expression (CTE) is straightforward and that looks like this:

WITH PagedMembers  AS
(
	SELECT	M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, 
			M_DATE, M_COUNTRY
	FROM	dbo.FORUM_MEMBERS
)
SELECT	M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, 
		M_DATE, M_COUNTRY
FROM	PagedMembers
ORDER BY M_POSTS DESC;

This query has the exact same execution plan, query cost and logical reads when you run it. At this point all the CTE has give us is a named wrapper around a simple query. The WITH clause defines the name of the CTE as "PagedMembers" In fact, it acts an awful lot like a derived table.

SQL Server 2005 adds a number of functions that are useful with common table expressions. One of these is ROW_NUMBER. If we include this in our CTE our new query looks like this:

WITH Members  AS
(
	SELECT	M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, M_DATE, M_COUNTRY,
			ROW_NUMBER() OVER (ORDER BY M_POSTS DESC) AS RowNumber
	FROM	dbo.FORUM_MEMBERS
)
SELECT	RowNumber, M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, M_DATE, M_COUNTRY
FROM	Members
WHERE	RowNumber BETWEEN 1 AND 20
ORDER BY RowNumber ASC;

The ROW_NUMBER function needs an OVER clause. In our example the OVER clause contains an ORDER BY clause. This just tells the ROW_NUMBER function in what order to generate the row numbers. It's possible to have multiple ROW_NUMBER functions in a single SELECT statement each with their own ordering. You can see we aliased the ROW_NUMBER function and then referenced that column in the WHERE clause of the outer SELECT statement. This only selects the first twenty members from those with the most posts.

This query has a cost of 0.3 and does 71 logical reads. It runs faster because it's able to use an index on M_POSTS that the other queries weren't able to take advantage of. If that index weren't there it would have the performance as the other query but return much less data to the client. As you increase the number of rows returned or how far down into the results you go the logical reads will increase. On my query the breakeven was around 500 rows. If I'm looking for a page that starts at row 500 or earlier this method is faster -- or at least has fewer logical reads. If I'm looking for a page that starts after row 500 then the table scan was faster. I'm most cases people aren't paging past the first few pages of results.

You can also use the ROW_NUMBER function in a straight SELECT statement like this:

SELECT	M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, M_DATE, M_COUNTRY,
	ROW_NUMBER() OVER (ORDER BY M_POSTS DESC) AS RowNumber
FROM	dbo.FORUM_MEMBERS
ORDER BY M_POSTS DESC;

This will return a column for the row number but you won't be able to use it in the WHERE clause. You'll have to write it in a CTE or a derived table for that to work. Common table expressions also expect to be the first statement in a batch or they expect the statement preceding them to be terminated with a semicolon.

You can easily use variables to determine which rows to SELECT by passing in parameters for the page size and which page to return:

DECLARE	@PageSize INT,
	@PageNumber INT,
	@FirstRow INT,
	@LastRow INT

SELECT	@PageSize = 20,
	@PageNumber = 3

SELECT	@FirstRow = ( @PageNumber - 1) * @PageSize + 1,
	@LastRow = (@PageNumber - 1) * @PageSize + @PageSize ;

WITH Members  AS
(
	SELECT	M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, M_DATE, M_COUNTRY,
			ROW_NUMBER() OVER (ORDER BY M_POSTS DESC) AS RowNumber,
			ROW_NUMBER() OVER (ORDER BY M_NAME DESC) AS RowNumber2
	FROM	dbo.FORUM_MEMBERS
)
SELECT	RowNumber, M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, M_DATE, M_COUNTRY
FROM	Members
WHERE	RowNumber BETWEEN @FirstRow AND @LastRow
ORDER BY RowNumber ASC;

Common table expressions and the ROW_NUMBER function are a handy way to server side paging on a query result.


Related Articles

Joining to the Next Sequential Row (2 April 2008)

Writing Outer Joins in T-SQL (11 February 2008)

How to Use GROUP BY with Distinct Aggregates and Derived tables (31 July 2007)

How to Use GROUP BY in SQL Server (30 July 2007)

SQL Server 2005: Using OVER() with Aggregate Functions (21 May 2007)

Using XQuery, New Large DataTypes, and More (9 May 2006)

Counting Parents and Children with Count Distinct (10 January 2006)

Computing the Trimmed Mean in SQL (29 June 2004)

Other Recent Forum Posts

How Much Memory Is SQL Server Using? (19h)

Detailed search in a large sql file (21h)

How to handle a variable with an apostrophe (1d)

Get count for records (1d)

Calculate distance/length of linestring (2d)

Delete Duplicate (2d)

Why are queries hanging on ASYNC_NETWORK_IO? (2d)

Find all related query ids / queries executed for 1 SP (2d)

- Advertisement -