Dynamic Cross-Tabs/Pivot Tables

By Rob Volk on 12 March 2001 | Tags: Dynamic SQL


IMHO, the best feature of MS Access is the TRANSFORM statement, used to create cross-tabs/pivot tables. It does all of the work of dynamically generating the cross-tabulation and the summary calculations. T-SQL unfortunately doesn't have this statement, so you're stuck using complicated SQL commands, expensive 3rd party products, or exotic OLAP to make pivot tables...or you can use the following procedure to dynamically create them!

I got the idea from this question, asking how to "undo" a pivot table, and then I started working on how to create them in T-SQL. There are numerous ways of doing pivot tables, and this site has several examples (and lots of other cool stuff). The standard method uses a CASE statement, with one CASE for each pivot value (the column headings created by cross-tabbing the pivot column). The greatest shortcoming is finding a way to handle an unknown or changing number of pivot values. Obviously you have to know these values beforehand, and you must add a CASE for each new, distinct value inserted into the pivot column. The code listed below will do all of the work for you:

CREATE PROCEDURE crosstab 
@select varchar(8000),
@sumfunc varchar(100), 
@pivot varchar(100), 
@table varchar(100) 
AS

DECLARE @sql varchar(8000), @delim varchar(1)
SET NOCOUNT ON
SET ANSI_WARNINGS OFF

EXEC ('SELECT ' + @pivot + ' AS pivot INTO ##pivot FROM ' + @table + ' WHERE 1=2')
EXEC ('INSERT INTO ##pivot SELECT DISTINCT ' + @pivot + ' FROM ' + @table + ' WHERE ' 
+ @pivot + ' Is Not Null')

SELECT @sql='',  @sumfunc=stuff(@sumfunc, len(@sumfunc), 1, ' END)' )

SELECT @delim=CASE Sign( CharIndex('char', data_type)+CharIndex('date', data_type) ) 
WHEN 0 THEN '' ELSE '''' END 
FROM tempdb.information_schema.columns 
WHERE table_name='##pivot' AND column_name='pivot'

SELECT @sql=@sql + '''' + convert(varchar(100), pivot) + ''' = ' + 
stuff(@sumfunc,charindex( '(', @sumfunc )+1, 0, ' CASE ' + @pivot + ' WHEN ' 
+ @delim + convert(varchar(100), pivot) + @delim + ' THEN ' ) + ', ' FROM ##pivot

DROP TABLE ##pivot

SELECT @sql=left(@sql, len(@sql)-1)
SELECT @select=stuff(@select, charindex(' FROM ', @select)+1, 0, ', ' + @sql + ' ')

EXEC (@select)
SET ANSI_WARNINGS ON

Yeah, I know, you want to know how it works :) I won't go into detail on the mechanics of the code. Let's just say that if you can follow it, feel free to play with it; if not, DON'T TOUCH IT! The syntax for the procedure call is below, followed by a description of each parameter:

12 34
EXECUTE crosstabSELECT statement,summary calculation,pivot column,table name
  1. The SELECT statement can be anything, as long as it has proper syntax and includes a GROUP BY clause. You can use JOINs, but if you use table aliases you should include the alias in the summary calculation expression (2).
  2. The summary calculation must have an aggregate function like SUM(), AVG(), MIN(), MAX(), etc. You'd have to modify the code if you want to use DISTINCT with these functions. COUNT(*) won't work, you have to COUNT on a column.
  3. The pivot column must be in the table (4). You can use an expression for the pivot column (a+b, LEFT(FirstName,3), etc.) as long as it can be derived from the table listed in (4). A cross-tab heading will be created for each distinct value in the pivot colum/expression.
  4. This table can be any table in your database, or another database if you use the full naming syntax (database.owner.table). Tables in a linked server may also work, but I haven't tested this. It's possible that a derived table (nested SELECT) can work, but I haven't tested this either. You would need to enclose the SELECT statement in parentheses, and use a table alias outside these parentheses, like this: '(SELECT LastName FROM myTable) AS Surnames'

I'll list some cross-tab settings and the results. Here's two you can run in the pubs database:

EXECUTE crosstab 'select title from titles inner join sales on (sales.title_id=titles.title_id) 
group by title', 'sum(qty)','stor_id','stores'
title638070667067713178968042
But Is It User Friendly? 30
Computer Phobic AND Non-Phobic Individuals: Behavior Variations 20
Cooking with Computers: Surreptitious Balance Sheets 25
Emotional Security: A New Algorithm 25
Fifty Years in Buckingham Palace Kitchens 20
Is Anger the Enemy? 3 75 10 20
Life Without Fear 25
Onions, Leeks, and Garlic: Cooking Secrets of the Mediterranean 40
Prolonged Data Deprivation: Four Case Studies 15
Secrets of Silicon Valley 50
Silicon Valley Gastronomic Treats 10
Straight Talk About Computers 15
Sushi, Anyone? 20
The Busy Executive's Database Guide 5 10
The Gourmet Microwave 25 15
You Can Combat Computer Stress! 35
EXECUTE crosstab 'select pub_name, count(qty) as orders, sum(qty) as total 
from sales inner join titles on (sales.title_id=titles.title_id) 
right join publishers on (publishers.pub_id=titles.pub_id) 
group by pub_name', 'sum(qty)','type','titles'
pub_nameorderstotalbusinessmod_cookpopular_comppsychologytrad_cookUNDECIDED
Algodata Infosystems613555 80
Binnet & Hardley7150 50 2080
Five Lakes Publishing0
GGG&G0
Lucerne Publishing0
New Moon Books820835 173
Ramona Publishers0
Scootney Books0

Here's one that will run in Northwind:

EXECUTE crosstab 'SELECT LastName FROM Employees INNER JOIN Orders 
ON (Employees.EmployeeID=Orders.EmployeeID) 
GROUP BY LastName', 'count(lastname)', 'Year(OrderDate)', 'Orders'

LastName199619971998
Buchanan111813
Leverling187138
Fuller164139
Peacock318144
Callahan195431
Suyama153319
King113625
Dodsworth51919
Davolio265542

Some things to look out for:

  • This procedure builds a SQL statement from your parameters, inserts the CASE statements to handle the cross-tabbing, and then executes it. The size of this statement is maxed out at 8000 characters, so there's a limit on the size and complexity of the SELECT statement and the number of pivoted values you can have.
  • This procedure uses a global temporary table, and will cause an error if two sessions execute it at the same time. (thanks for catching this Garth!) I will have a remedy soon and will post the fixed code in this article.
  • Date values can be pivoted, but remember that the timestamp portion will be included, so you should CONVERT() them in the pivot expression. NULLs can't be pivoted, you must use IsNull() to replace them with a value. I have not tested nvarchar or nchar data types, and the code may need to be modified to work with them. Nor have I tried ntext or text types, but I'm sure they can't be pivoted anyway.
  • You can use a HAVING clause, or include WITH CUBE or WITH ROLLUP in the SELECT statement and it will return the appropriate summary rows for each group.
  • You can even ORDER BY cross-tabbed columns, something you can't do with the TRANSFORM statement in Access.

I'm spending some time working on enhancements to this procedure (for a follow-up article), to allow things like subqueries, multiple pivoting columns, non-aggregate values (e.g. - show the store with the highest sales each month, pivoted by book type). I would love to hear from anyone who has suggestions or solutions on how to improve this code.

Enjoy!

-robvolk


Related Articles

Using Dynamic SQL in Stored Procedures (7 March 2011)

Passing an array of values to SQL Server (Stored Procedure) without parsing/string manipulation (26 March 2007)

Efficient and Dynamic Server-Side Paging with T-SQL (23 March 2004)

Introduction to Dynamic SQL (Part 2) (27 June 2001)

Introduction to Dynamic SQL (Part 1) (20 June 2001)

Implementing a Dynamic WHERE Clause (14 January 2001)

Other Recent Forum Posts

Calculating Numerators and Denominators (11h)

Access problem to linked server with mobile app (11h)

Xml query with custom xml namespace? (1d)

Data replication between two databases (5d)

Group by clause with multiple columns (5d)

SSRS error on sign in ERR_UNEXPECTED (8d)

SSIS Component C sharp source (9d)

Simple SQL Update Query behaviour changing based on record count (10d)

- Advertisement -