A technical blog about my projects, challenges, and discoveries in the world of data warehousing using SQL Server, Power BI Desktop, DevExpress, and more.

Sunday, December 11, 2011

Use FOR XML to Create a Delimited List

In the past few years there have been only a handful of ideas that have revolutionized the way I write queries. Common table expressions are one, and so are window functions. Both of those came along in SQL Server 2005, and so did the revolutionary idea I'm going to share today: the FOR XML clause.

I am the first to admit that I am not an XML expert; in fact, I don't know much about it at all. Interestingly, though, I use FOR XML to achieve a result that doesn't have anything to do with XML; rather, I use it to turn a set of rows into a comma-delimited list. I can't even take the credit for the idea of using FOR XML in this way. About the only "new" twist I will take credit for is moving the technique out of the FROM clause (as a CROSS APPLY expression) and into the SELECT clause (as a correlated subquery).

I think perhaps the best way to explain this is to take a simple example and break it down into its component parts, one portion of the query at a time. In Shelby v.5's ShelbyQUERY module the following query will return one row per person, and each row will have a comma-delimited list of phone numbers for that person. The outer portion is the "primary query" that returns one row per person in the database. Each row contains the NameCounter and the full name, first and last concatenated together.

select NameCounter,
 Name = FirstMiddle + ' ' + LastName,
 PhoneList = stuff((select ', ' + Descr + ': ' + PhoneNu from Shelby.NAPhones p inner join Shelby.NAPhoneTypes t on p.PhoneCounter = t.Counter where p.NameCounter = n.NameCounter order by t.Counter for xml path('')), 1, 2, '')
from Shelby.NANames n

Now let's break down the column that constructs the phone list. It will make the most sense to work from the inside out. The innermost portion is a subquery that returns all of the phone numbers (and their type information) for each name. Normally this would trigger an error, because a subquery in the SELECT clause cannot return more than one row of results; the FOR XML clause lets us sidestep that rule, as we'll see in a moment. Notice that the subquery includes an ORDER BY clause to sort the results, and that it adds a comma and a space in front of every value returned.

select NameCounter,
 Name = FirstMiddle + ' ' + LastName,
 PhoneList = stuff((select ', ' + Descr + ': ' + PhoneNu from Shelby.NAPhones p inner join Shelby.NAPhoneTypes t on p.PhoneCounter = t.Counter where p.NameCounter = n.NameCounter order by t.Counter for xml path('')), 1, 2, '')
from Shelby.NANames n

The FOR XML clause finishes off the subquery. The PATH('') specification, with an empty string as the element name, tells FOR XML to return the values with no XML tags wrapped around them, so all of the subquery's rows collapse into one continuous string.

select NameCounter,
 Name = FirstMiddle + ' ' + LastName,
 PhoneList = stuff((select ', ' + Descr + ': ' + PhoneNu from Shelby.NAPhones p inner join Shelby.NAPhoneTypes t on p.PhoneCounter = t.Counter where p.NameCounter = n.NameCounter order by t.Counter for xml path('')), 1, 2, '')
from Shelby.NANames n

Finally, the STUFF() function takes the first two characters of the string returned by the subquery and replaces them with an empty string, thereby cutting off the extra comma and space at the front of the string.

select NameCounter,
 Name = FirstMiddle + ' ' + LastName,
 PhoneList = stuff((select ', ' + Descr + ': ' + PhoneNu from Shelby.NAPhones p inner join Shelby.NAPhoneTypes t on p.PhoneCounter = t.Counter where p.NameCounter = n.NameCounter order by t.Counter for xml path('')), 1, 2, '')
from Shelby.NANames n
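If you want to experiment with the technique outside of Shelby v.5, here is a minimal, self-contained sketch against a table variable. The table name, columns, and sample values are all invented for illustration:

```sql
-- Hypothetical sample data standing in for the Shelby phone tables.
declare @Phones table (NameCounter int, Descr varchar(20), PhoneNu varchar(20))
insert into @Phones values (1, 'Home', '555-0101')
insert into @Phones values (1, 'Cell', '555-0102')
insert into @Phones values (2, 'Home', '555-0201')

-- One row per NameCounter, with a comma-delimited list of that person's phones.
select distinct
 NameCounter,
 PhoneList = stuff((select ', ' + Descr + ': ' + PhoneNu
                    from @Phones p2
                    where p2.NameCounter = p1.NameCounter
                    order by PhoneNu
                    for xml path('')), 1, 2, '')
from @Phones p1
-- NameCounter 1 yields 'Home: 555-0101, Cell: 555-0102';
-- NameCounter 2 yields 'Home: 555-0201'.
```

One caveat worth knowing: FOR XML entitizes characters such as & into &amp;, so for data other than plain phone numbers you may need an extra step to undo that.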

To give you a little more of a taste of how this technique can work for you, here is a query that includes the phone list created above plus a list of Servant module profiles and a list of children for the individual.

select NameCounter,
 Name = FirstMiddle + ' ' + LastName,
 ServantProfiles = stuff((select ', ' + Profile from Shelby.NAProfiles p where p.NameCounter = n.NameCounter and p.Profile like 'Z%' order by Profile for xml path('')), 1, 2, ''),
 PhoneList = stuff((select ', ' + Descr + ': ' + PhoneNu from Shelby.NAPhones p inner join Shelby.NAPhoneTypes t on p.PhoneCounter = t.Counter where p.NameCounter = n.NameCounter order by t.Counter for xml path('')), 1, 2, ''),
 Children = stuff((select ', ' + c.FirstMiddle from Shelby.NANames c where c.FamNu = n.FamNu and n.UnitNu in (1,2) and c.UnitNu = 2 order by c.Birthdate for xml path('')), 1, 2, '')
from Shelby.NANames n

Sunday, November 6, 2011

SQL Saturday #96 Wrap-Up and Top 10

SQL Saturday #96 happened in Washington, D.C. yesterday, and it was a great event. It is a special event to me because it was my first opportunity to speak at a PASS event and give back to a community that has given me much and transformed my queries for the better time and time again. If you are visiting my blog because you came to one of my sessions, please leave me some feedback on the presentation -- constructive criticism is as welcome as praise.

If you would like my PowerPoint presentations along with the .sql files I used for the demos in SQL Server Management Studio, you can download them from the SQL Saturday #96 Schedule page.

And in the same vein as my comments on the PASS Summit, I'm going to post the top ten things I learned yesterday. These are in roughly chronological order.
1. If you ever go to a SQL Saturday event, definitely use the Speed Pass and print it off before you go to the venue on the day of the event. If you don't, you'll likely be in a line waiting to check in before you can join the fun.

2. Apparently Twitter or some Twitter-related organization ranks people's "influence" based on their tweets and the number of people who respond to them. This ranking is quite important to many people, and it was a point of animated discussion in the speakers' room at this event.

3. I discovered that people who are coming to T-SQL and SQL Server from Oracle and other platforms can benefit from an introductory course such as my introduction to Common Table Expressions, even when they have more years of SQL experience than I do. I suppose this is self-evident, but it was a little eye-opening to me.

4. Even when a feature has been around since SQL Server 2005, people who have been using T-SQL for a long time may have overlooked it, as with the window functions I now take for granted. I was a little pleased when I heard a couple of attendees express surprise that it is possible to generate an aggregate value without a GROUP BY clause, because it meant that I had something to share that I knew could help them tremendously.

5. Receiving applause for a presentation is truly gratifying, even when it is a customary act done for every presenter. It seemed authentic in both my sessions, and therefore it was meaningful to me.

6. It seems that for most Business Intelligence professionals, the primary justification for having a data warehouse is the ability to create an OLAP cube for analysis. I'm not sure that will be my justification. I think its primary benefit as a companion to our OLTP database would be to capture historical data for slowly changing dimensions, with a close second (or even a tie for first) being a clearer semantic model for our end users in report design.

7. The date dimension generally adds the most overhead to any OLAP cube because it adds 365 rows for every year tracked. If there is any way to summarize at a weekly or coarser granularity, that is a huge win for the efficiency of the cube.

8. It is ever clearer that any data warehouse built as a companion for the Shelby v.5 OLTP database will include a single "EntityNames" role-playing dimension containing all the entity attributes across all the available modules, supplemented with views that take the core table and break out the module-specific attributes.

9. Even a large software company like Microsoft can overlook a feature that the end user sees as patently obvious. The case in point for today is the ability to update automatically all the Reporting Services reports that use a report part that has been changed. It's a manual process for now.

10. The built-in change tracking mechanism in SQL Server may be a great option for customers who want to know when things have changed at a more detailed granularity than we capture in the OLTP tables. But it only captures three days' changes by default, so a secondary mechanism for moving that data to a flat file or supplemental table is needed to go further back from there.

There you have it, my top 10 things learned yesterday at SQL Saturday #96. I'll return to blogging more procedural stuff after today. Let me finish this entry by saying that I really enjoyed the privilege of sharing my knowledge with the PASS community, and I hope to have similar opportunities in the future. If I do, you'll definitely be reading about it here. :-)

Tuesday, October 18, 2011

Top Ten Things I Learned at SQL PASS Summit on 10/14/2011

Travel and busy-ness with my baby Destiny have prevented my posting this last installment in my list of top lessons learned at SQL PASS, but now here it is:

1. Buck Woody and Rob Farley are not only superstars in SQL technologies, they are talented as singer/songwriters too. Check out the video of the opening of the keynote session for proof. Cue it up at 00:02:27 to skip to the song. Their song was my favorite moment of the week.

2. On Wednesday I wrote that I wanted to learn more about Hadoop since it was part of a new feature of SQL Server 2012, and today I learned more about it than I really wanted to know, but the presentation by Dr. DeWitt was fascinating. Hadoop is a method of "divide and conquer" applied to massive amounts of unstructured (or more accurately loosely structured) data, the kind collected by websites such as Facebook, eBay, and Google.

3. If you add the -nosplash switch after the path to the SSMS executable in the shortcut properties, the application will open a few seconds faster. It does not seem like much, but it's a handy tip that I put in place right away.

4. If there are more than a couple of OR conjunctions in any given WHERE clause, using UNION ALL is probably a better approach. The query optimizer in SQL Server is much better at streamlining two or more small query conditions rather than one huge, complex one.
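As a rough illustration of that rewrite (the table and column names here are invented, not from any real schema):

```sql
-- OR form: one complex predicate; the optimizer may fall back to a scan.
select OrderId from Orders where CustomerId = 42 or SalesRep = 'Smith'

-- UNION ALL form: two simple predicates, each able to use its own index.
-- The second branch excludes rows the first branch already returned,
-- taking care that NULLs in CustomerId are not silently dropped.
select OrderId from Orders where CustomerId = 42
union all
select OrderId from Orders
where SalesRep = 'Smith'
  and (CustomerId <> 42 or CustomerId is null)
```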

5. When possible, replace correlated subqueries in the SELECT clause with joins in the FROM clause. This is because the correlated subquery must execute once for every row in the main results, eliminating any optimization that is possible when the data are correlated in the FROM clause.
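For example (with hypothetical Customers and Orders tables), the correlated form runs its inner query once per customer, while the join form computes the aggregate once as a set and then joins it back:

```sql
-- Correlated subquery in the SELECT clause: evaluated per outer row.
select c.CustomerId,
       OrderCount = (select count(*) from Orders o
                     where o.CustomerId = c.CustomerId)
from Customers c

-- Equivalent join form: aggregate once in a derived table, then join.
-- ISNULL() covers customers with no orders, which the left join returns as NULL.
select c.CustomerId,
       OrderCount = isnull(oc.Cnt, 0)
from Customers c
left join (select CustomerId, Cnt = count(*)
           from Orders
           group by CustomerId) oc on oc.CustomerId = c.CustomerId
```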

6. I learned how to create a "temporal snapshot fact table" for a data warehouse solution. This improves the efficiency of a "daily summary" fact table by using date ranges instead of individual days as the granularity level, ranges during which nothing changes in the data. I suspect this may come in handy when constructing fact tables for pledges and donations. It is definitely a nice concept to keep in the toolbox of report design.

7. I learned that the query optimizer can anticipate the best query plan even when part of the query statement has been "refactored" into a common table expression (CTE).

8. Thanks to Audrey Hammonds, I learned some great techniques for explaining CTEs effectively. I plan to use some of her presentation style and content ideas when I present an introduction to CTEs at SQL Saturday #96 in Washington, DC.

9. CROSS APPLY is comparable to an inner join, and OUTER APPLY is comparable to a left join. In that same context, I learned that there is no difference in query efficiency when a CROSS APPLY table-valued expression is moved into a correlated subquery within the SELECT clause. This is reassuring, since I have already made such a move with the FOR XML technique of turning row values into delimited strings.
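To make the comparison concrete, here is a sketch of the two equivalent shapes, reusing the Shelby phone-list idea from my FOR XML post (treat the exact schema names as illustrative):

```sql
-- CROSS APPLY form: the applied expression can reference the outer row,
-- much like an inner join (OUTER APPLY would behave like a left join).
select n.NameCounter, x.PhoneList
from Shelby.NANames n
cross apply (select PhoneList = stuff((select ', ' + PhoneNu
                                       from Shelby.NAPhones p
                                       where p.NameCounter = n.NameCounter
                                       for xml path('')), 1, 2, '')) x

-- Correlated-subquery form in the SELECT clause: per what I learned in the
-- session, the optimizer produces the same plan, so the choice is style.
select n.NameCounter,
       PhoneList = stuff((select ', ' + PhoneNu
                          from Shelby.NAPhones p
                          where p.NameCounter = n.NameCounter
                          for xml path('')), 1, 2, '')
from Shelby.NANames n
```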

10. Snow Crash by Neal Stephenson is a book I need to read. When Audrey referenced it in her presentation, about half the audience recognized it and praised it as a great read. She mentioned a few other books throughout her talk, but this one definitely stirred the strongest positive response.

Friday, October 14, 2011

Top Ten Things I Learned at SQL Pass on 10/13/2011

The learning continues...

1. The "AlwaysOn" feature of SQL Server 2012 greatly simplifies high availability through redundant servers and simultaneously optimizes reporting by allowing you to use secondary servers for reporting and the primary server for transaction processing. However, it is currently unknown whether or not AlwaysOn will be in any edition other than Enterprise. It seems likely that it is an Enterprise-only feature.

2. The "ColumnStore Index" feature in SQL Server 2012 could radically speed up queries against tables in the Shelby v.5 database, especially attendance history. I can't wait to try it, and I may even install Denali before the official release of 2012.

3. There is a version of SQL Server Express coming out with 2012 that can easily be embedded into applications. This could potentially be a way to simplify the distribution of Shelby v.5 for those customers who do not own a license of SQL Server Standard. I will be looking into it more back home.

4. The "Semantic Model" features of SQL Server 2012, when coupled with the "FileTable" feature, make for a powerful combination, capable of searching document content based not just on string matching but on actual pairing of concepts. I was blown away by the demo. Any librarian in the world would have been salivating over the processes and results I saw.

5. Two more times in one day I was shown the fallacy of putting dates inside of a function in the WHERE clause. I will never do it again, I promise! But I also learned that datetime and smalldatetime are being deprecated in SS2012, meaning that we should probably start using the new date and time data types instead. I will pass that info along to our database developers.
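For reference, here is a quick sketch of the newer types, which have actually been available since SQL Server 2008 (the sample values are arbitrary):

```sql
-- The newer date and time types: narrower storage and wider ranges
-- than datetime/smalldatetime.
declare @d   date              = '2011-10-13'                   -- date only, 3 bytes
declare @t   time(0)           = '16:13:24'                     -- time only, to whole seconds
declare @dt2 datetime2(3)      = '2011-10-13 16:13:24.173'      -- wider range, configurable precision
declare @dto datetimeoffset(0) = '2011-10-13 16:13:24 -07:00'   -- includes the time zone offset

select @d as JustDate, @t as JustTime, @dt2 as DateTime2, @dto as WithOffset
```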

6. I learned that by simply changing the varchar data type to nvarchar, we could be capturing special characters for names, addresses, and more, which would make the v.5 database friendlier to international users. I'm going to do some testing on this when I get home too, to verify what is happening in our db right now when someone puts special characters into the varchar fields.
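Here is a quick way to see the difference for yourself (the sample string is arbitrary):

```sql
-- The N prefix marks a Unicode literal; without it, even the nvarchar
-- assignment would lose the non-code-page characters before storage.
declare @v varchar(30)  = N'Ålesund Müller 日本'
declare @n nvarchar(30) = N'Ålesund Müller 日本'

select VarcharValue = @v, NvarcharValue = @n
-- On a typical Latin1 collation, @v keeps Å and ü but turns 日本 into ?
-- placeholders, while @n preserves every character.
```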

7.  I need to kick the habit of not specifying the length of varchar values and others when I use them as variables. I've been snakebit by that one before, and what I learned today just reinforces that I need to specify length 100% of the time.
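The pitfall is easy to demonstrate, because the default lengths are not what most people expect:

```sql
-- Omitting the length does not mean "unlimited."
declare @v varchar = 'hello world'   -- no length: defaults to varchar(1) for variables
select @v                            -- returns just 'h'

select cast('abcdefghijklmnopqrstuvwxyz0123456789' as varchar)
-- in CAST/CONVERT the default is varchar(30), so this silently truncates
```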

8.  Some people, no matter how smart they are about what you want to discuss, can just be too annoying to talk to. It's just better to bide your time and find someone who may not be as knowledgeable but who is willing to listen and to converse at a pace that matches your own.

9.  The 2013 Summit will be in Charlotte, NC. That will be much more convenient for me to attend, and I am very much looking forward to not having to adjust to a new time zone in order to participate.

10.  Making free form reports with PowerPivot data in the background is not only possible, it allows for greater control over formatting and the embedding of elements such as sparklines into the data results. Free form reports can even use slicers to control the content. The only downside is the fixed nature of the rows and columns, which is why PivotTables are so handy in the first place.

Thursday, October 13, 2011

Top Ten Things Learned at SQL PASS on 10/12/2011

Wednesday is when the conference starts in earnest, and so I definitely learned more than I can summarize here, but these ten things really stood out to me. As with my first list, this is roughly chronological over the course of the day.

  1. There are over 4,000 registered attendees at this event. Add in the presenters and the vendors, and you can imagine how packed the rooms are. It is good to have such a thriving community; and yet, it does strain the capacity of the venue.
  2. SQL Server Code Named "Denali" has officially been named "SQL Server 2012," and is expected to be released in the first half of next year. "Crescent" has officially been named "PowerView" and will be a part of the SS2012 release, but apparently it requires the Enterprise edition. This is unfortunate, because that will put it in reach of only a couple of our Shelby Systems, Inc. customers.
  3. SQL Server 2012 will provide connectors and (eventually) drivers for something called "Apache Hadoop." This is something I will need to learn more about. Apparently Hadoop is a way to consume gigantic amounts of essentially unstructured data (on the order of petabytes) and generate meaningful output. I don't know that I'll ever need that in my current role at Shelby Systems, but it is an interesting feature.
  4. I learned several things during the demos in the keynote, including: "sentiment" is now a measurable fact that can be quantified and reported on through the Microsoft Data Marketplace and the upcoming "Data Explorer"; only John Wayne has made more movies than Samuel L. Jackson.
  5. Stephen Few is a name I heard multiple times with regard to optimizing report style and layout. I believe I'm going to have to look into his web site at www.perceptualedge.com and check out his books.
  6. If I can get ahold of some mapping files that include the ZIP code layer (the Census Bureau is said to have some in the public domain), I should be able to get the mapping feature in Reporting Services to map down to that level. That would be much better than the current limit of reporting at the state level.
  7. Satori Software (a vendor here as well as a partner with Shelby Systems, Inc.) could actually return latitude and longitude information back to the NAAddresses table if we added that to our service agreement with them. The additional licensing cost makes it unlikely that this feature would be a part of all our customer's ShelbyMAILROOM package, but it is interesting to know.
  8. The inactivated slicer problem I noticed in the Excel PivotTables that are created programmatically from ShelbyQUERY is probably triggered by some option in the automated setup of the table. A manually-created OLE DB connection can use slicers just fine. It appears to have something to do with an OLAP-related setting, but I'm going to have to follow up on this more with Microsoft. Helpfully, one of the PowerPivot techs gave me his contact info and invited me to e-mail him if I couldn't pinpoint it on my own. I have to give props to the Microsoft staff here at PASS, who are as helpful as they can be.
  9. I have been doing date filtering in the WHERE clauses of queries in a way that will impede the efficiency of the query, possibly adding multiple seconds of time to getting the query results.  I have to learn ways to avoid putting table columns into functions, even the DATEDIFF() function.
  10. The pie chart is not an effective way of communicating comparisons in reports. Bar charts are far and away the best chart for most comparisons. Gauges too are relatively ineffective communicators. But the bullet chart is exceedingly good at communicating "KPI" style data with more context. I will have to learn how to create them in Reporting Services.
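The date-filtering lesson in item 9 looks like this in practice. The table and column here are hypothetical, and the two queries are only nearly equivalent (DATEDIFF counts day boundaries rather than elapsed time), but the performance point stands:

```sql
-- Non-sargable: the column is wrapped in DATEDIFF(), so SQL Server must
-- evaluate the function for every row and cannot seek on an index.
select * from AttendanceHistory
where datediff(day, AttendDate, getdate()) <= 30

-- Sargable: leave the column bare and move the arithmetic to the constant
-- side, so an index on AttendDate can be used for a seek.
select * from AttendanceHistory
where AttendDate >= dateadd(day, -30, getdate())
```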

Wednesday, October 12, 2011

Top Ten Things I Learned at SQL PASS on 10/11/11

More lessons from the SQL PASS Summit 2011:

  1. When your laptop is a block or more away, that is the point at which you will most need it urgently.
  2. It is quite a good feeling to be an "alumnus" of the event and be able to help the first-timers get where they need to be and orient them to the essentials for the best PASS experience.
  3. Although the views I have built for use in the ShelbyQUERY environment of our software are helpful for simplifying query writing, they are not really "data warehouse" structures. They blend too many distinct "dimensions" of data together, and they confuse (at points) the dimension/fact dichotomy. Thus I cannot really use that work as the guideline for a data warehouse database design.
  4. A dimension that is a "role playing" dimension does not have to exist as multiple instances in the data warehouse, not even as views of the different roles. If two keys in the fact table reference the same dimension for distinct purposes, it is a "role playing" dimension.
  5. Pragmatic Works, a SQL consulting firm based in Jacksonville, Florida has actually designed a data warehouse solution for a large Baptist church in that state. I'm sure my jaw dropped when Brian Knight (founder of Pragmatic Works) told me this in today's precon session. How could this be, and I didn't know it?! I will be meeting with Brian at the Pragmatic Works booth later this week to find out more.
  6. The topic "Building a Microsoft Business Intelligence Platform" sounds too ambitious to be effectively covered in a one-day training session because it actually is. Yes, Brian and Devin (the presenters) touched on each phase of the process, but the touches were alternately too light and too technical, leading to frustration on my part and (I suspect) on the part of others in attendance. It was often a case of lingering too long on a technical detail of the process so that the presenters rushed the remaining steps in the current topic to move on to the next.
  7. Subway stores in Seattle do not stock provolone cheese and do not accept the points card that is so handy in the Subway stores in Virginia. On the other hand, they do have the caloric values of all the sandwiches and other menu items displayed on the menu.
  8. There is an interesting similarity in the number of languages one must learn to be a biblical scholar and to be a SQL Server developer. To be a biblical scholar, one must learn Hebrew, Greek, Aramaic, German, and Latin (at least). To be a SQL Server developer, one must learn T-SQL, MDX, DAX, VB.net, and the proprietary "expression language" of SSIS and SSAS. And, just like the languages of Hebrew and Aramaic, they can be similar enough to be confusing when you are trying to learn them all.
  9. The PASS Virtual Chapter for Business Intelligence has four to five live webinar-style presentations a month, generally over the lunch hour, and these webinars are then posted for free as recorded webinars the next day. I will definitely be joining this "chapter" of PASS to take advantage of the resources it provides and (perhaps down the road) to find a way to contribute back to the SQL community that has so helped me in my work these last couple of years.
  10. There are people here at PASS who are familiar with Shelby Systems, Inc., even apart from the Pragmatic Works folks who worked with our customer in Florida. I met a man at lunch today from Houston who attends a church where the staff uses Shelby v.5. He volunteers as IT support at his church of about 1000 members. They have been users of v.5 since before the conversion to SQL Server from MS Access, and he confirmed that the move was a tremendous improvement in the stability of the database.

Monday, October 10, 2011

Top 10 Things I Learned at SQL PASS on 10/10/2011

I have the privilege of being here in Seattle for the SQL PASS Summit conference all week, including two days of "pre-convention" all-day training topics. Today was the first day of the conference, and I wanted to continue a tradition of summarizing the top ten things I learned today at PASS. These are in roughly chronological order.
  1. It is good to get back on a treadmill and work out, even though a four-month break means that I could only go 2.6 miles in 45 minutes instead of 3.1 miles in about 40 minutes.
  2. The scuttlebutt I heard last year is true: the Paramount Hotel is indeed a great place to stay for PASS, as it is reasonably priced and only a block or so from the convention center. And most of the walk there is under covered awnings. If you are familiar with Seattle weather in October, you know this is no small fact.
  3. Talking to and singing to my almost-three-month-old daughter Destiny over the phone is a pale substitute for doing those things with her in my arms, but even through the phone her baby talk can melt my heart.
  4. Apparently SQL PASS Summit attendees receive a new backpack every year.
  5. The Guidebook app for Android phones (and probably iPhones too) is a great tool for an event such as this conference, and it is definitely something we should consider using for the International Shelby Conference next year.
  6. In SQL Server Reporting Services, if you use a "cast" function (such as CDate or CStr) on a textbox value to give it a specific data type, it opens up a large set of VB.net methods that can make working with that value much easier. This is especially true for datetime values.
  7. I really need to become comfortable with creating and using stored procedures instead of plain Jane SQL queries and views. They offer distinct benefits, which appear to be worth the trouble even though ShelbyQUERY cannot create them easily.
  8. In Reporting Services it is possible to create code blocks that do more involved VB.net functions than can be summarized into a single one-line of code in an expression. I knew this had to be the case, but today I learned how to make that happen.
  9. Using a "post back" concept in the Action property of a Reporting Services report element creates the ability to have interactive "buttons" that change the parameters of the report content; for instance, it is possible to add "slicer" style buttons to a chart or matrix report design.
  10. A walk from the hotel to Starbucks to get a refreshing shaken iced tea lemonade takes just enough time to wait out a false fire alarm at the hotel and return to see the firefighters climb into the fire engine and head out.

Thursday, September 15, 2011

Calculating ISO 8601 Dates in T-SQL

If you do not work with international dates much, you may never have heard of ISO 8601 and you may never need to know what I'm about to cover in this post. If so, consider yourself lucky. I am not so lucky. I have been called upon to design a few queries for international customers of Shelby Systems, and in a few cases the queries have required temporal (i.e. time-related) analysis based on the ISO 8601 standard. And although I was not lucky enough to avoid the task of working with ISO 8601 in SQL, I have learned quite a bit about temporal querying in the process, so what I have lacked in luck I have made up for in knowledge. I see that as a fair trade.

In brief, ISO 8601 is a set of rules standardizing the representation of dates and times to facilitate international commerce and communication. For a comprehensive discussion of the full ISO 8601 standard, read the article on Wikipedia. For the purposes of this post there are just three points you need to know about this standard:
  • It always expresses dates and times from the largest term to the smallest, i.e. YYYY-MM-DD.
  • It counts Monday as the first day of the week, not Sunday.
  • It counts the first week of the year as the week containing the first Thursday of the year, not the week containing the first day of the year.
Each of these facets of the ISO 8601 standard is at odds with the default ways in which T-SQL handles dates. There are workarounds for each one, though each workaround is more complex than the last. I'll start with the easiest first.

Express Dates and Times from the Largest Term to the Smallest

The default ways to get a datetime value and express it as a string in date notation are the following:

convert(varchar, getdate())
cast(getdate() as varchar)

Both of these will return the result in the same format. For July 7th, 2011 at 4:13 pm, it would look like this:

Jul 7 2011  4:13PM

Apart from the three-letter abbreviation for the month and lack of a comma, this is a fairly typical way of expressing the date and time in America. But it is not at all in keeping with the ISO standard. Fixing this is fairly easy, though, because the CONVERT() function has an optional format parameter, and there are values that will return true ISO-formatted date and time results. Here are the values and their outputs for the same day and time as above:

convert(varchar, getdate(), 112) = 20110707 (date only)
convert(varchar, getdate(), 108) = 16:13:24 (time only)
convert(varchar, getdate(), 126) = 2011-07-07T16:13:24.173 (complete date and time)

For the last variation, notice the T separator between date and time. Also notice the inclusion of milliseconds as a decimal fraction of a second. The last format also includes hyphens in the date where the date-only format did not. Both variations are acceptable in the ISO 8601 standard. The format without the hyphens is the "basic format," and the format with the hyphens is the "extended format."

Count Monday as the First Day of the Week

This is actually the key reason why churches may be interested in using an ISO week even in the United States or other countries that do not use the ISO 8601 standard as a general rule. Most churches that have Friday or Saturday worship events want to count the following Sunday worship as part of the same weekend for statistical purposes. The ISO standard does that by lumping Sunday in with the prior six days.

However, this is uniquely tricky in the ShelbyQUERY environment because it does not allow the command SET DATEFIRST 1, which would tell SQL Server to count Monday as the first day of the week. One simple workaround is to subtract one day from the date when doing comparisons based on which week the date belongs to. This treats Sunday as if it were Saturday of the prior week and Monday as if it were the first day of the week. Simply using getdate()-1 wherever you use getdate() will do the trick. Just remember not to subtract the day when you want to display the actual calendar date.
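For example, a weekend statistics query might group on the shifted date like this (the Events table and EventDate column are hypothetical):

```sql
-- Subtracting one day pushes Sunday back into the prior week, so a
-- Friday-through-Sunday weekend groups together even under the default
-- Sunday-first week setting.
select datepart(year, EventDate - 1) as WkYear,
       datepart(week, EventDate - 1) as WkNumber,
       count(*) as EventCount
from Events
group by datepart(year, EventDate - 1), datepart(week, EventDate - 1)
order by WkYear, WkNumber
```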

For statistical calculations based on a Monday through Sunday week, group the values by the ISO week and the ISO "week year" (i.e., the year in which the week falls -- which can be a different year than the calendar year because the first few days and last few days of the year can fall into the other year's week count). How to calculate the ISO week value is the next topic.

Count the first week of the year as the week containing the first Thursday of the year, not the week containing the first day of the year.

This is the most difficult part of ISO, and it comes down to calculating the week of the year. The DATEPART() function has a week parameter option that returns the week of the year; however, it does not follow ISO rules. It will count January 1st as being in the first week of the year no matter when it falls during the week. If you are using SQL Server 2008 or later you have it easy because Microsoft introduced a new DATEPART() parameter option in 2008 that does follow the ISO 8601 standard: iso_week. If you are using SQL Server 2005 or earlier, though, you are going to have to calculate it the hard way.
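On SQL Server 2008 or later, the whole calculation collapses to one function call:

```sql
-- January 1, 2010 fell on a Friday, so under ISO rules it belongs to the
-- last week of 2009, not week 1 of 2010.
select datepart(iso_week, cast('2010-01-01' as datetime))  -- returns 53
select datepart(iso_week, cast('2010-01-04' as datetime))  -- the first Monday: week 1
```

The algorithm below reproduces this result by hand for SQL Server 2005 and earlier.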

This is the algorithm I use in the query below to calculate the ISO week of the year:
  1. Let TargetDate equal the date for which we want to know the ISO week value.
  2. Let CurrentThursday equal the date of the Thursday associated with the same week as the TargetDate (taking into account that Sunday belongs with the prior Thursday).
  3. Let FirstThursday equal the date of the Thursday of the week containing January 4th (by definition January 4th is always in the first ISO week of the year), basing that on the year value of the CurrentThursday (not the TargetDate).
  4. Let ISO_Week_Value equal the difference, in weeks, between the FirstThursday and the CurrentThursday, plus one.
Here is a query that uses variables to reflect each step of the algorithm and includes two AllInOne versions that rely on only the @TargetDate variable and no others. The FROM clause at the end is only necessary in ShelbyQUERY because of a requirement in that environment to always include a FROM clause.

declare @DateOffset as smallint, @ThursdayOffset as smallint, @TargetDate as datetime, @CurrentThursday as datetime, @FirstThursday as datetime, @ISO_WeekNumber as tinyint

/* The @DateOffset is used to make sure that Monday - Wednesday are always in the same week as the following Thursday and that Friday - Sunday are always in the same week as the prior Thursday no matter what the @@datefirst value is.
The @ThursdayOffset is used to find Thursday no matter what the @@datefirst value is. */ 

set @DateOffset = case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end
set @ThursdayOffset = case when @@datefirst < 5 then 5 - @@datefirst else 12 - @@datefirst end

set @TargetDate = getdate()

set @CurrentThursday = dateadd(day, @ThursdayOffset - datepart(dw, @TargetDate + @DateOffset), @TargetDate + @DateOffset)

set @FirstThursday = dateadd(day, @ThursdayOffset - datepart(dw, cast('1/4/' + datename(year, @CurrentThursday) as datetime) + @DateOffset), cast('1/4/' + datename(year, @CurrentThursday) as datetime) + @DateOffset)

set @ISO_WeekNumber = datediff(week, @FirstThursday, @CurrentThursday) + 1

select
 TestDate = @TargetDate,
 CurrentThursday = @CurrentThursday,
 FirstThursday = @FirstThursday,
 ISO_WeekNumber = @ISO_WeekNumber,
 ISO_WeekYear = year(@CurrentThursday),

 ISO_WeekNumber_AllInOne = datediff(week, dateadd(day, case when @@datefirst < 5 then 5 - @@datefirst else 12 - @@datefirst end - datepart(dw, cast('1/4/' + datename(year, dateadd(day, case when @@datefirst < 5 then 5 - @@datefirst else 12 - @@datefirst end - datepart(dw, @TargetDate + case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end), @TargetDate + case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end)) as datetime) + case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end), cast('1/4/' + datename(year, dateadd(day, case when @@datefirst < 5 then 5 - @@datefirst else 12 - @@datefirst end - datepart(dw, @TargetDate + case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end), @TargetDate + case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end)) as datetime) + case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end), dateadd(day, case when @@datefirst < 5 then 5 - @@datefirst else 12 - @@datefirst end - datepart(dw, @TargetDate + case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end), @TargetDate + case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end)) + 1,
  ISO_WeekYear_AllInOne = year(dateadd(day, case when @@datefirst < 5 then 5 - @@datefirst else 12 - @@datefirst end - datepart(dw, @TargetDate + case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end), @TargetDate + case when @@datefirst > 4 then @@datefirst - 8 when @@datefirst > 1 then @@datefirst - 1 else 0 end))

from (select A = count(*) from Shelby.NANames) a
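
Circling back to the statistics use case at the top of this post: once you can compute the ISO week and the ISO week year, the weekly grouping itself is simple. Here is a sketch assuming SQL Server 2008 or later; the Shelby.CNHst table and its CNDate and Amount columns are my assumptions for illustration, so substitute your own table and date column:

```sql
-- Weekly totals grouped by ISO week year + ISO week (SQL Server 2008+).
-- The week year is taken from the Thursday of each date's week, so that
-- year-boundary days land in the correct week year. The expression
-- datediff(day, 0, CNDate) % 7 returns 0 for Monday through 6 for Sunday
-- (day zero, 1900-01-01, was a Monday), independent of @@datefirst.
select
 ISO_WeekYear = year(dateadd(day, 3 - datediff(day, 0, CNDate) % 7, CNDate)),
 ISO_Week     = datepart(iso_week, CNDate),
 Gifts        = count(*),
 Total        = sum(Amount)
from Shelby.CNHst
group by
 year(dateadd(day, 3 - datediff(day, 0, CNDate) % 7, CNDate)),
 datepart(iso_week, CNDate)
order by 1, 2
```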


Thursday, July 28, 2011

Find the First Non-Null Value in a List

Recently a customer asked how to list the first available phone number for an individual, regardless of what phone type it might be. If there was a Main/Home phone, use it. But if not, list the cell phone instead. If the person had neither but did have a work phone, that would do fine. They only wanted one phone number column in the results, though.

In more technical terms, the customer wanted the first non-null phone number value from the NANames table for a given NameCounter, given the priority of Main/Home first if available, then cell, then work. As it happens, T-SQL provides a very handy function that returns the first non-null value from a list. That function is COALESCE. It works exactly like ISNULL, but instead of taking only two values in the parameter list, COALESCE can take any number of values. The first non-null value in the list is returned as the value of the function. If all the values are null, the value of the function is null.
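
Before the real query, here is the behavior in isolation; the literal values are my own:

```sql
-- COALESCE returns the first non-null value in its list.
-- If every value is null, the result is null. (At least one argument
-- must have a known type, hence the CAST in the second column.)
select
 FirstNonNull = coalesce(null, 'second', 'third'),           -- 'second'
 AllNull      = coalesce(cast(null as varchar(10)), null)    -- null
```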

Here is a simple query on the ShelbyDB database to show how COALESCE can return the first non-null phone number value from a list of phone type values:

select
 Name = FirstMiddle + ' ' + LastName,
 PhoneNumber = coalesce('Home: ' + MainPhones.PhoneNu, 'Cell: ' + CellPhones.PhoneNu, 'Work: ' + WorkPhones.PhoneNu)
from
 Shelby.NANames as Names left join
 Shelby.NAPhones as MainPhones on Names.NameCounter = MainPhones.NameCounter and MainPhones.PhoneCounter = 1 left join
 Shelby.NAPhones as CellPhones on Names.NameCounter = CellPhones.NameCounter and CellPhones.PhoneCounter = 4 left join
 Shelby.NAPhones as WorkPhones on Names.NameCounter = WorkPhones.NameCounter and WorkPhones.PhoneCounter = 2

This query assumes that the Main/Home PhoneCounter value is 1 (it always is), that the Work PhoneCounter is 2 (it always is), and that the Cell Phone PhoneCounter is 4 (it often is, but not always). You should verify the PhoneCounter values for the phone types you want if you use this type of query approach in your own Shelby v.5 database.

In the example I also appended a descriptor to identify which type of phone number was listed. Remember that concatenating a literal string to a null value still results in a null value.
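
That null-propagation rule is exactly what makes the labels safe here, and it is easy to demonstrate in isolation (the literals are mine):

```sql
-- Under the default CONCAT_NULL_YIELDS_NULL ON setting, concatenating a
-- string literal with NULL yields NULL, so a missing phone produces no
-- stray 'Home: ' label -- COALESCE simply moves on to the next value.
select
 MissingPhone = 'Home: ' + cast(null as varchar(20)),  -- null
 PresentPhone = 'Home: ' + '555-0123'                  -- 'Home: 555-0123'
```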

COALESCE comes in handy for any situation where you want to supply the "first available" value from a list of possible values. I have used it to supply either Greeting Type, Salutation, or First Middle name, whichever is available. I have used it to populate an address column with either a preferred address type or the Main/Home address if the preferred type is not available.

Thursday, July 14, 2011

Multivalue Parameters in SSRS 2005

I just posted earlier today, but I need to post this topic while it is fresh on my mind. It took some digging, and I want to get this out there while I can remember it clearly. I want to thank Munish Bansal for his blog post on this topic, which was where I found the answer to my search on this question.

In SQL Server 2008 Reporting Services, handling multivalue parameters is relatively straightforward. You just set the parameter property to multivalue and make sure that the filter is set to the IN comparison instead of equals (=). But in SSRS 2005 it is not quite that easy. You set up the parameter the same way, but the filter works differently. Instead of adding it as a filter at all, you add the condition as a Parameter property and put the IN logic into the dataset query directly. For example, the query's WHERE clause would have a condition like this in it:

WHERE My_Column IN (@MyParameter)

Then in the Parameters property of the dataset query you use the JOIN function to turn the multivalue parameter information into a comma-delimited set of values.
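
The expression in the dataset's parameter value looks like this; the parameter name is a placeholder, so substitute your own:

```vb
=Join(Parameters!MyParameter.Value, ",")
```

JOIN concatenates the array of selected values into a single comma-delimited string, which the IN (@MyParameter) condition in the query then expands against.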

Once this is done the parameter will work as desired.

Hopefully you are working in SSRS 2008 because the improved handling of multivalue parameters along with many other enhancements make it almost a non-decision to upgrade as soon as possible to the latest SQL Server version. Denali promises even more enhancements in Reporting Services, and I can't wait to see what's coming next.

Ethernet Over Power thermal problems

If my last post was a departure from my normal topics, this one is a detour through back roads. Don't worry, though, I am working on a lengthy post that will get me back on track with SQL topics.

In the meantime, let me explain something that happened to me last week. The very first symptom was that my desktop would not print to my network printer. The printer itself appeared to be working normally, and I could even pull up the printer's web interface from my desktop. But I could not print. As I tried to resolve the communication between my desktop and the printer, I found my overall network connectivity was becoming intermittently interrupted. The problem escalated until at one point I used the ipconfig utility to manually release my IP address and attempted to renew it. No renewal occurred; DHCP was not working for me.

I would have assumed the router itself was down or having trouble, except that the laptops (yes, my wife and I each have two -- each of us has one for work and one personal laptop) were working fine with their wireless connectivity directly to the router. The only device apparently affected was my desktop. Having concluded this, I ordered a replacement network adapter. My desktop has the new PCI Express ports and no standard PCI ports, and I discovered that network adapters that fit PCI Express ports are difficult to impossible to find in a regular store, even an electronics store. So I ordered it on the Internet, and I expect it to arrive on Monday.

In the meantime my IP phone from work started having trouble. And this made me start the whole process over of trying to find the problem. One device that I thought I had tested and eliminated from the equation was the ethernet-over-power (EoP) adapter that I use to bridge my office computer equipment with the router downstairs, which is just barely within the range of the wireless router signal. But when it became apparent that something more was wrong than just the network adapter in my PC, I took another look. I moved the EoP adapter from the wall plug where it normally sits and moved it downstairs. While carrying it, I noticed that the EoP adapter was not just warm; it was downright hot, especially the metal ports that hold the RJ45 connectors.

It did not occur to me right then, but gradually I began to wonder if the problem was a thermal one. Our air conditioning has been on the fritz for three days, meaning that my office was warmer than usual. Perhaps the EoP adapter was getting too hot to work properly. To test this, I left the adapter unplugged all last night and plugged it in this morning. Voila. Everything is working again on the network, including my desktop and the IP work phone. Unless something more occurs (which I doubt), I am going to chalk this up to thermal issues. I will probably unplug the EoP adapter in the evening before going to bed and plug it back in when I get to my desk the next day.

I post this as a cautionary tale to those out there who use ethernet-over-power adapters. If your network connectivity becomes unstable, don't overlook the possibility that the EoP adapter is overheating. It might save you some time and expense. As for me, I will have to see about the Amazon.com return policy.

Wednesday, June 1, 2011

Reapplying the Notes Master Style in PowerPoint

This post is a departure from my usual topics because I have been working for several weeks on the handouts for the 2011 International Shelby Conference, coming up June 14th through 17th.

Our preferred format for handouts is in PowerPoint, and we print the handouts in Notes view so that we can have a place for notes at the bottom of each page and so that we can put a nice cover page and final bio page on each set of slides. In any case, we receive PowerPoint files in a variety of styles and settings. Many times the notes view does not match the preferred template setup, and even resetting the Notes Master does not immediately apply to the notes view of individual pages.

It can take several clicks per page to "reapply the master" to the notes view. For lengthy presentations, this can add up to a very monotonous and time-consuming task. When faced with a 60+ page presentation that needed the notes master re-applied to every single page, I decided to look for a better solution.

I found one. I found a VBA macro script that reapplies the notes master style to every page in the presentation in a single step, and I want to share it here for your benefit (and for my own next year when I am once again working on this task). I give credit to the PPTools web site for re-posting the macro script that saved the day for me.  I am also posting it here because it appears to be in the public domain. If I am incorrect about that and you are the holder of a copyright of the script, please let me know. The macro VBA script is:

Sub ApplyMasterToNotes()

' Modified version of code originally posted to
' msnews.microsoft.com public newsgroups by
' David Foster in May of 1999

Dim ctl As CommandBarControl
Dim oSl As Slide

' 700 is the control ID for Layout
Set ctl = CommandBars.FindControl(Id:=700)
ActiveWindow.ViewType = ppViewNotesPage

If (ctl Is Nothing) Then
MsgBox "command not available"
Exit Sub
End If

For Each oSl In ActivePresentation.Slides

' go to the current slide
ActiveWindow.View.GotoSlide (oSl.SlideIndex)

' Bring up the dialog
ctl.Execute

' send it the needed keystrokes
SendKeys "%r{enter}"

Next oSl

End Sub

If you would like some help implementing this code, I once again recommend the PPTools web site, which has an article on how to use VBA code in PowerPoint.

Thursday, April 21, 2011

ShelbyQUERY Text Limit

I apologize for originally posting this as an empty message. For some reason the message was lost when I clicked to publish it, and I just didn't have time to re-write the whole thing.

What I intended was to post the maximum number of characters that the ShelbyQUERY text editor can handle. I had to determine this maximum because a customer contacted me about an overflow error when she was using ShelbyQUERY, and it turned out she was exceeding the maximum character limit of the editor. Oddly, the editor can hold more text than the "maximum" technically allowed, and it can execute the query as well. But if you try to move the insertion cursor anywhere past the maximum limit, you will receive an overflow error. So it is just good to avoid hitting that limit if you can.

Through some experimentation, I discovered that the maximum number of characters is 32,766. However, carriage returns count as two characters each (a carriage return plus a line feed), and spaces count as one each.

You should not hit this limit unless you are doing some extremely complex or multi-layered query analysis. But if you start getting overflow errors in ShelbyQUERY, this is the first thing to check.

As a troubleshooting tip, I recommend copying the text of your query into MS Word and using the Review > Word Count tool. Take the character count (with spaces) and add twice the number of lines; that total is the number of "characters" in your query as the editor sees it. This will give you an idea of how much text needs to be cut out of the query.

If you are in a real pinch to squeeze out every extraneous space, you can remove carriage returns and use a space instead. You can even remove the space character if there is a punctuation mark (such as a comma or a single quotation mark) to separate key words in the statement. The result will not be readable, but it may be enough to fit the limit of the ShelbyQUERY text editor.

A better "workaround" might be to just use MS SQL Server Management Studio. That tool has no arbitrary limit to the amount of text, and so you can execute extremely long query statements without any trouble at all.

Tuesday, March 8, 2011

The Semicolon and Common Table Expressions

In the last few months I have begun using Common Table Expressions. These are subqueries that could be used in the FROM clause of the primary query, but instead they have been moved into a WITH clause before the primary query statement, even above the SELECT clause.

Although I will explore Common Table Expressions in later blog posts, I just want to document something I discovered in relation to using them in ShelbyQUERY. Curiously, what I discovered is also related to the little-used semicolon punctuation mark.

The semicolon is the terminator symbol for a T-SQL command. This includes query statements and other functions supported in T-SQL. Therefore every query written in T-SQL could -- some might say should -- end in a semicolon to indicate the end of the statement. However, I will admit that none, or almost none, of the queries I write end with a semicolon. This is because the semicolon is optional in the great majority of situations. There is one situation where the semicolon is not optional, and that is for any T-SQL command that precedes a WITH clause to define common table expressions.

This is particularly important to note in ShelbyQUERY because experience has shown that ShelbyQUERY actually prefixes one or more T-SQL commands in front of the actual query that is written in the text editor. Although I do not yet know what these commands are exactly, I suspect they are there to enforce the table-level security restrictions of the Supervisor user security setup and also the General Ledger user account restrictions. Whatever the commands are, it has become clear that sometimes (though not always) the final command prior to the actual query does not terminate with a semicolon.

Of course, in the vast majority of cases the lack of a semicolon on the prefixed command is moot. However, if the query is to begin using WITH and common table expressions, it is definitely not moot. In such a case the lack of a semicolon on the prefixed command will trigger a syntax error near the word WITH.

The simplest solution is to add a semicolon before the word WITH, either on a line above it or just in front of it on the same line. The fact that T-SQL is not picky about spacing means that you can put it anywhere you like, as long as it is in front of the word WITH.

While I'm on the topic, I will also caution anyone out there who uses variables in your queries. If you put DECLARE and SET statements in front of a WITH clause, you also need to put the semicolon after the last statement prior to the word WITH. Of course, you could also put the semicolon after every complete statement. Putting the semicolon only in front of the word WITH is the minimum requirement, but it is certainly more rigorous to put one after every complete T-SQL statement.
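
Here is a minimal sketch of the pattern; the CTE body is a placeholder, but the leading semicolon is the point:

```sql
-- The semicolon terminates whatever statement precedes the WITH clause,
-- whether that is your own DECLARE/SET or a command ShelbyQUERY prefixes.
declare @MinCounter as int
set @MinCounter = 100
;with RecentNames as
(
 select NameCounter, FirstMiddle, LastName
 from Shelby.NANames
 where NameCounter > @MinCounter
)
select * from RecentNames
```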

Monday, March 7, 2011

Discovering the Compatibility Level of a Database

Even though a database is running on a particular version of SQL Server, it can have a compatibility level set to a prior version. If that is the case, queries that rely on features introduced in the current version of SQL Server may fail to work on your database even though the SQL Server version meets the requirements.

To find out your database's compatibility level, run the following query:

select name, compatibility_level from sys.databases

The number corresponds to the version of SQL Server as shown below:

80 = SQL Server 2000 (or MSDE)
90 = SQL Server 2005
100 = SQL Server 2008

If the compatibility_level value is lower than the version of SQL Server you are using to host the database, you can change the compatibility level to match by using the SQL Server Management Studio software on the server. Right-click on the database name and choose Properties. Then choose the Options page and set the Compatibility Level drop-down option to the highest level available, which will be the same as the version of the SQL Server host.
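
If you prefer a script to the dialog, the same change can be made with T-SQL; the database name here is a placeholder, and on SQL Server 2000 you would use the sp_dbcmptlevel stored procedure instead:

```sql
-- Raise the compatibility level to SQL Server 2008 (level 100)
alter database ShelbyDB set compatibility_level = 100
```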

Tuesday, January 18, 2011

Table Aliases - and an apology

First, I need to apologize for the lengthy interruption in my posting to this blog. For explanation I will just point to the Thanksgiving and Christmas holidays, the usual January crunch in the Shelby Systems offices when it is "all hands on deck" to answer Support questions, and also (perhaps primarily) the fact that I hit a file-system permissions snag on the transactional replication process, and I have been unable to make more headway on that since my last post.

Anyway, while the transactional replication concept stews for awhile, I want to put out a quick post on using table aliases. This is a technique I have already used in a few earlier SQL query posts, in fact most of them, but I haven't given it an explanation of its own yet. Using a Support inquiry as my impetus, I will post that explanation now and revive this blog at the same time I address this question.

A table alias is nothing more than a "new name" for a database table. Table aliases are assigned in the FROM clause, and when they are used, all other clauses must reference the table by the alias instead of the original name. This applies to the SELECT clause as well. This may seem counter-intuitive because the SELECT clause appears before the FROM clause and thus "before" the alias has been assigned. But this is only in the visible order of operations. In the logical order of operations (the order that matters to the server), the FROM clause is interpreted first.

Table aliases are assigned simply by putting the alias name immediately after the table name in the FROM clause. An optional "AS" keyword can introduce the alias. If the alias contains a space character, brackets are required around it. Here is an example:

SELECT Names.NameCounter, Names.FirstMiddle, Names.LastName
FROM Shelby.NANames AS Names
ORDER BY Names.LastName, Names.FirstMiddle

Notice in this example that the alias of the NANames table is Names. The SELECT clause references the Names alias, not the original table name. And the ORDER BY clause also references the alias.

There are several advantages to using table aliases, including:
  • The ability to reference the same table multiple times, as long as each reference has a different alias. (This is perhaps the biggest advantage.)
  • The ability to give more meaningful names to tables.
  • The ability to give shorter names to tables.
  • The ability to type only once the fully qualified table name, which can include the database name, the schema name, and the table name, and instead use the alias alone for all other references to the table.
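
That first advantage deserves a quick illustration: the same phone table joined twice under two different aliases, using the PhoneCounter conventions noted in the COALESCE post above:

```sql
-- Shelby.NAPhones appears twice, once per alias, so each row can show
-- both the home phone (PhoneCounter = 1) and the work phone (PhoneCounter = 2)
select
 Names.FirstMiddle,
 Names.LastName,
 HomePhone = Home.PhoneNu,
 WorkPhone = Work.PhoneNu
from
 Shelby.NANames as Names left join
 Shelby.NAPhones as Home on Names.NameCounter = Home.NameCounter and Home.PhoneCounter = 1 left join
 Shelby.NAPhones as Work on Names.NameCounter = Work.NameCounter and Work.PhoneCounter = 2
```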

Because of these advantages, I use table aliases almost 100% of the time when I write queries. For examples in practice, view the query samples in these earlier blog posts:

Listing Husband and Wife on One Line
Calculating Age in T-SQL
Rollup and Grouping Functions
Pivoting Rows to Columns