How to Pivot Data in SQL Server, Old-School Style

There are a handful of common requirements that lead SSRS report authors to seek the help of SQL developers and query writers. One such requirement is pivoting data.

Pivoting data displays rows as columns; at least that's how most people describe it when asking me for help. For example, managers commonly want to consume sales data with a row for each salesperson and a sales amount column for each month, as shown in the figure below.

Figure 1


Unfortunately, most data is stored in relational form, with one row per salesperson per month, as shown by Query 1 and Results 1.

Query 1

use AdventureWorks2008R2
go
select
	p.FirstName + ' ' + p.LastName SalesPersonName
	, MONTH(h.OrderDate) OrderMonth
	, SUM(h.TotalDue) TotalSales
from
	Sales.SalesOrderHeader h
	join Person.Person p on p.BusinessEntityID = h.SalesPersonID
where
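	-- assumed upper bound: first quarter of 2008, matching the Jan-Mar columns pivoted later
	h.OrderDate >= '20080101' and h.OrderDate < '20080401'
group by
	p.FirstName + ' ' + p.LastName
	, MONTH(h.OrderDate)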

Results 1


Reshaping the data in Results 1 to look like the data in Figure 1 is called pivoting, and there are a handful of ways to do it.

This post will focus on what I call the Old-School style of pivoting in SQL Server. This is the Kool Moe Dee of pivoting.

Another post will focus on the New-School style of pivoting in SQL Server.

Down the road I'll also show you how to Pivot in Excel and with a few other common tools.

Old-School Pivoting in SQL Server

The first step is to layer a new query on top of Query 1. This new query will define the columns and what each represents.

Query 2

select
	x.SalesPersonName
	, case when x.OrderMonth = 1 then x.TotalSales else 0 end JanAmount
	, case when x.OrderMonth = 2 then x.TotalSales else 0 end FebAmount
	, case when x.OrderMonth = 3 then x.TotalSales else 0 end MarAmount
from
	(
		select
			p.FirstName + ' ' + p.LastName SalesPersonName
			, MONTH(h.OrderDate) OrderMonth
			, SUM(h.TotalDue) TotalSales
		from
			Sales.SalesOrderHeader h
			join Person.Person p on p.BusinessEntityID = h.SalesPersonID
		where
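			-- assumed upper bound: first quarter of 2008, matching the Jan-Mar columns above
			h.OrderDate >= '20080101' and h.OrderDate < '20080401'
		group by
			p.FirstName + ' ' + p.LastName
			, MONTH(h.OrderDate)
	) x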

If you break apart Query 2, you can see our original query has been put in the from clause of the outer query. This will allow us to further manipulate the data it returns.

You will also see a case expression for each column we want to pivot. Each one returns the sales value when the row belongs to that column's month (and 0 otherwise) and gives the column a name. If you execute this query you'll get the results below.

Results 2
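
As a self-contained sketch of this intermediate shape, the query below runs the same case expressions over a few hypothetical rows. The names and amounts are invented for illustration and are not taken from AdventureWorks; the inline values list simply stands in for the subquery.

```sql
-- Hypothetical sample rows standing in for the output of Query 1.
select
	x.SalesPersonName
	, case when x.OrderMonth = 1 then x.TotalSales else 0 end JanAmount
	, case when x.OrderMonth = 2 then x.TotalSales else 0 end FebAmount
	, case when x.OrderMonth = 3 then x.TotalSales else 0 end MarAmount
from
	(
		values
			('Pat Example', 1, 100.00)
			, ('Pat Example', 2, 250.00)
			, ('Pat Example', 3, 75.00)
			, ('Sam Sample', 1, 40.00)
			, ('Sam Sample', 3, 60.00)
	) x (SalesPersonName, OrderMonth, TotalSales)
-- Five rows come back, one per input row.
-- Each row carries its amount in exactly one month column and 0 in the other two.
```

Nothing has been collapsed yet: Pat Example still appears three times, once per month.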


So we've moved the values into columns, but instead of one row for each Sales Person we have three. To collapse these into one row, we have to do another aggregate as is shown in Query 3.

Query 3

select
	x.SalesPersonName
	, SUM(case when x.OrderMonth = 1 then x.TotalSales else 0 end) JanAmount
	, SUM(case when x.OrderMonth = 2 then x.TotalSales else 0 end) FebAmount
	, SUM(case when x.OrderMonth = 3 then x.TotalSales else 0 end) MarAmount
from
	(
		select
			p.FirstName + ' ' + p.LastName SalesPersonName
			, MONTH(h.OrderDate) OrderMonth
			, SUM(h.TotalDue) TotalSales
		from
			Sales.SalesOrderHeader h
			join Person.Person p on p.BusinessEntityID = h.SalesPersonID
		where
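			-- assumed upper bound: first quarter of 2008, matching the Jan-Mar columns above
			h.OrderDate >= '20080101' and h.OrderDate < '20080401'
		group by
			p.FirstName + ' ' + p.LastName
			, MONTH(h.OrderDate)
	) x
group by
	x.SalesPersonName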

Results 3


In this case, I added a SUM aggregate to each column and grouped by Sales Person. In Results 2, Amy Alberts had three rows, and their JanAmount values were 364.82, 0, and 0. We simply added these together and collapsed the three rows into one. Each case expression returns the value when the row is for the correct month and 0 otherwise, so the sum always returns only the correct value for that month.
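
Here is a minimal sketch of that collapse that runs without AdventureWorks. The rows are hypothetical (invented names and amounts), supplied inline in place of the subquery.

```sql
-- Hypothetical sample rows standing in for the output of Query 1.
select
	x.SalesPersonName
	, SUM(case when x.OrderMonth = 1 then x.TotalSales else 0 end) JanAmount
	, SUM(case when x.OrderMonth = 2 then x.TotalSales else 0 end) FebAmount
	, SUM(case when x.OrderMonth = 3 then x.TotalSales else 0 end) MarAmount
from
	(
		values
			('Pat Example', 1, 100.00)
			, ('Pat Example', 2, 250.00)
			, ('Pat Example', 3, 75.00)
			, ('Sam Sample', 1, 40.00)
			, ('Sam Sample', 3, 60.00)
	) x (SalesPersonName, OrderMonth, TotalSales)
group by
	x.SalesPersonName
-- Two rows come back, one per sales person:
--   Pat Example   100.00   250.00   75.00
--   Sam Sample     40.00     0.00   60.00
```

Sam Sample has no February row, so every case expression for FebAmount falls through to the else branch and the sum is 0.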

Pivoting More Than One Column

One of the benefits of Old-School Pivoting is that it is very easy to add another column to the pivot. Let's say you want two columns for each month: Jan Count will have the number of sales, and Jan Amount will be the total sales amount as we have it in the previous results.

To add another column, we need to add its definition to the subquery in the from clause, and add additional columns to the select statement as is shown in Query 4.

Query 4

select
	x.SalesPersonName

	, SUM(case when x.OrderMonth = 1 then x.OrderCount else 0 end) JanCount
	, SUM(case when x.OrderMonth = 1 then x.TotalSales else 0 end) JanAmount

	, SUM(case when x.OrderMonth = 2 then x.OrderCount else 0 end) FebCount
	, SUM(case when x.OrderMonth = 2 then x.TotalSales else 0 end) FebAmount

	, SUM(case when x.OrderMonth = 3 then x.OrderCount else 0 end) MarCount
	, SUM(case when x.OrderMonth = 3 then x.TotalSales else 0 end) MarAmount

from
	(
		select
			p.FirstName + ' ' + p.LastName SalesPersonName
			, MONTH(h.OrderDate) OrderMonth
			, COUNT(h.SalesOrderID) OrderCount
			, SUM(h.TotalDue) TotalSales
		from
			Sales.SalesOrderHeader h
			join Person.Person p on p.BusinessEntityID = h.SalesPersonID
		where
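			-- assumed upper bound: first quarter of 2008, matching the Jan-Mar columns above
			h.OrderDate >= '20080101' and h.OrderDate < '20080401'
		group by
			p.FirstName + ' ' + p.LastName
			, MONTH(h.OrderDate)
	) x
group by
	x.SalesPersonName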

Results 4
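
To see the shape of this output without AdventureWorks, here is a rough sketch on hypothetical rows. The names, counts, and amounts are invented for illustration; the inline values list stands in for the subquery.

```sql
-- Hypothetical rows: (SalesPersonName, OrderMonth, OrderCount, TotalSales).
select
	x.SalesPersonName
	, SUM(case when x.OrderMonth = 1 then x.OrderCount else 0 end) JanCount
	, SUM(case when x.OrderMonth = 1 then x.TotalSales else 0 end) JanAmount
	, SUM(case when x.OrderMonth = 2 then x.OrderCount else 0 end) FebCount
	, SUM(case when x.OrderMonth = 2 then x.TotalSales else 0 end) FebAmount
	, SUM(case when x.OrderMonth = 3 then x.OrderCount else 0 end) MarCount
	, SUM(case when x.OrderMonth = 3 then x.TotalSales else 0 end) MarAmount
from
	(
		values
			('Pat Example', 1, 3, 100.00)
			, ('Pat Example', 2, 5, 250.00)
			, ('Pat Example', 3, 2, 75.00)
			, ('Sam Sample', 1, 1, 40.00)
			, ('Sam Sample', 3, 4, 60.00)
	) x (SalesPersonName, OrderMonth, OrderCount, TotalSales)
group by
	x.SalesPersonName
-- Two rows come back, one per sales person, with a count and an amount for each month.
-- Months with no input row (February for Sam Sample) show 0 in both columns.
```

Both pivoted measures are read from the same derived table in a single pass, which is the property discussed in the performance notes.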


So there you have it. Data pivoted in SQL Server just like we've been doing it for a long time. With SQL Server 2005, Microsoft introduced the PIVOT operator, which I write about in How to Pivot Data in SQL Server - New-School Style.

One question I get a lot is whether pivoting in SQL Server is fast and efficient. In general, pivoting has little impact on overall performance. The FROM and WHERE clauses generally have the largest impact; simply reshaping the returned data costs comparatively little.

In the case of single-column pivots, there is little difference between New-School and Old-School. In my analysis, the execution plan created by each is identical.

In the case of multiple-column pivots, Old-School Pivoting has a distinct advantage over New-School Pivoting because the same resultset can be reshaped into multiple columns. New-School pivoting requires a subquery for each pivoted column, forcing SQL Server to build multiple datasets, pivot each, then join the pivots. I've played around with CTEs and other approaches but have yet to find a solution for pivoting multiple columns that performs as well as the Old-School approach. If anyone in the SQL Community can point me in the right direction, I'd appreciate it.
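
For comparison, here is a rough sketch of what that "one subquery per pivoted column" shape can look like with the PIVOT operator. It runs on hypothetical rows (invented names, counts, and amounts), and the aliases src, a, and c are illustrative; the query in the New-School post may differ.

```sql
-- Hypothetical rows: (SalesPersonName, OrderMonth, OrderCount, TotalSales).
with src (SalesPersonName, OrderMonth, OrderCount, TotalSales) as
(
	select * from (values
		('Pat Example', 1, 3, 100.00)
		, ('Pat Example', 2, 5, 250.00)
		, ('Pat Example', 3, 2, 75.00)
		, ('Sam Sample', 1, 1, 40.00)
		, ('Sam Sample', 3, 4, 60.00)
	) v (SalesPersonName, OrderMonth, OrderCount, TotalSales)
)
select
	a.SalesPersonName
	, c.[1] JanCount, a.[1] JanAmount
	, c.[2] FebCount, a.[2] FebAmount
	, c.[3] MarCount, a.[3] MarAmount
from
	(
		-- first pivot: sales amounts by month
		select SalesPersonName, [1], [2], [3]
		from (select SalesPersonName, OrderMonth, TotalSales from src) s
		pivot (SUM(TotalSales) for OrderMonth in ([1], [2], [3])) p
	) a
	join
	(
		-- second pivot: order counts by month
		select SalesPersonName, [1], [2], [3]
		from (select SalesPersonName, OrderMonth, OrderCount from src) s
		pivot (SUM(OrderCount) for OrderMonth in ([1], [2], [3])) p
	) c on c.SalesPersonName = a.SalesPersonName
```

One difference to keep in mind: a month with no rows comes back as NULL from PIVOT, whereas the Old-School case expressions return 0. Wrap the columns in ISNULL or COALESCE if you need the outputs to match.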

(See Post: How to Pivot Data in SQL Server, New-School Style)

