A technical blog about my projects, challenges, and discoveries in the world of data warehousing using SQL Server, Power BI Desktop, DevExpress, and more.

Monday, February 1, 2010

Rollup and Grouping Functions

If you have done many queries at all, you have probably used the GROUP BY clause to generate aggregate values across a subset of rows. For instance, here is a simple query to calculate the total amount given to each Purpose Code for each year of history in the Shelby v.5 database:

select
year(hst.CNDate) as GiftYear,
pur.Purpose as GiftPurpose,
sum(det.Amount) as TotalGiving
from
Shelby.CNHst as hst inner join
Shelby.CNHstDet as det on hst.Counter = det.HstCounter inner join
Shelby.CNPur as pur on det.PurCounter = pur.Counter
group by
year(hst.CNDate),
pur.Purpose
order by
year(hst.CNDate),
pur.Purpose

This query yields one row per year/purpose combination, showing the total receipts for each purpose in each year.


This is fine as far as it goes, but a simple addition can also give us subtotals for each year of all purposes and a grand total of all years and purposes. All you have to do is add the key words WITH ROLLUP at the end of the GROUP BY clause.

select
year(hst.CNDate) as GiftYear,
pur.Purpose as GiftPurpose,
sum(det.Amount) as TotalGiving
from
Shelby.CNHst as hst inner join
Shelby.CNHstDet as det on hst.Counter = det.HstCounter inner join
Shelby.CNPur as pur on det.PurCounter = pur.Counter
group by
year(hst.CNDate),
pur.Purpose with rollup
order by
year(hst.CNDate),
pur.Purpose

With that simple addition, the results would look like this:

The NULL values are rather unfortunate, though. It would probably be better to replace them with a descriptive phrase to show that the row is a total line. T-SQL includes a function called GROUPING() that helps with that. The GROUPING() function returns a 1 whenever the column inside the parentheses returns a NULL because it is part of a ROLLUP function. It returns a 0 if it is not NULL or if it is a NULL for some other reason other than a ROLLUP function. Thus we can use GROUPING to test for ROLLUP nulls and translate them into better values. Here is the example with the simple query we have been using.

select
case grouping(year(hst.CNDate)) when 1 then 'All Years' else cast(year(hst.CNDate) as varchar(4)) end as GiftYear,
case grouping(pur.Purpose) when 1 then 'All Purposes' else pur.Purpose end as GiftPurpose,
sum(det.Amount) as TotalGiving
from
Shelby.CNHst as hst inner join
Shelby.CNHstDet as det on hst.Counter = det.HstCounter inner join
Shelby.CNPur as pur on det.PurCounter = pur.Counter
group by
year(hst.CNDate), pur.Purpose with rollup
order by
year(hst.CNDate), pur.Purpose


And here is a sample of the results:


The final step would be to use GROUPING() in conjunction with the ORDER BY clause in order to move the total and subtotals to the bottom of each section, where most people expect to find them. I will leave that exercise for you.

No comments:

Post a Comment

Followers