A technical blog about my projects, challenges, and discoveries in the world of data warehousing using SQL Server, Power BI Desktop, DevExpress, and more.

Monday, November 9, 2009

Registrations - Key Information

I worked today for almost two hours with a customer in California who was seeing unexpected information in a query I had written for them some time ago. There were three things wrong with the results, and I learned a lesson from each of them when I discovered the ways to correct them.

First, the query was showing incorrect Custom Field values. After double-checking that the Registrations attendee information was displaying the expected values, I started digging through the query and the tables to figure out what was going on. I spent almost 2/3 of the entire time looking for the cause of this problem. To make the long story short, what I discovered is that the [RGRegCatIndCust].Counter column is not a truly unique Primary Key column. I found that the same Counter value had been used for two different attendees' custom information answers in two different events. The information that was showing up in my query results were for the wrong attendee at the wrong event. In order to correct this problem I had to use both [RGRegCatIndCust].Counter and [RGRegCatIndCustSetup].RGEveCounter as limiters on the custom field values. Once I did that, the query displayed the expected values.

Second, the query was showing the same registrant twice. This was because the attendee had been assigned to the wrong Category at first, and when the attendee information was moved to the correct Category, the amount owed changed. A new invoice had been created to adjust the amount owed. Since my query was grouping by invoice, the second invoice created a second entry in the results. I had to go in and remove the grouping by invoice and then add aggregate functions to the SELECT column information that dealt with invoice info. Once that was done, the correct balances displayed. I learned to always use an in-line view for the Registrations invoice information, because of the way that I used GROUP BY to pivot the custom answers into columns.

Third, the query was returning the wrong number of results. It was 11 shy of the correct number of attendees. I found 10 of the missing 11 by using LEFT joins to the invoice-related tables instead of INNER joins to them. When an attendee had no balance due for registering, there was no invoice created, and therefore the INNER joins were eliminating those attendees. The final 1 person was trickier. In order to get the last person on the results, I had to re-do the FROM clause so that [RGRegCatInd] (the Attendee table) was at the root, making sure to do LEFT joins to all the subsequent tables. That was I was sure to get at least one row for each attendee. What I discovered after I made the change was that one person had registered via WebView and that somehow in the synchronization process the Registrant information was lost but the Attendee information remained. Therefore my original approach of using RGReg (the Registrant table) as the root table omitted the one person whose Registrant information was lost. After I got the final missing person's row in the results, I could readily see that the Registrant information was null.

It was a productive two hours, and I learned some tips that I will apply to all my future Registrations queries. And I will probably need to go back and change some of my prior queries as well.

Thursday, October 22, 2009

Richmond Code Camp

I should have posted a message about the Richmond Code Camp weeks ago, since it actually happened October 3rd, but time just has gotten away from me all month. This was my first Code Camp experience, but certainly not my last. In case you haven't been to one, I should explain that Code Camp is a completely free day of lectures from programmers and other developers, mostly local personalities, but sometimes notable figures from the larger programming world. The most interesting lecture I attended was a presenter from Microsoft itself. More on him and his talk later.

There were so many good things going, and I learned something interesting in nearly every session. Here is a list of my take-aways from the event:
  • jQuery has nothing to do with SQL. I thought I was going to learn a bit about how to query a database source from a web page, but that was not it at all. JQuery is a cool add-on for JavaScript, but it has nothing particular to do with querying databases.
  • SharePoint is much more powerful (and much more complex) than I gave it credit for. I went to two sessions on SharePoint, and I am very impressed with its capabilities. However, I am also quite certain that I am far away from ever tapping its full potential.
  • I picked up the word "trivial" as a descriptor for solutions that do not require much effort to implement -- and the very important "non-trivial" descriptor for solutions that do.
  • I saw Microsoft Azure for the first time, the cloud-computing solution for developers that is currently in development itself but will be rolling out in production next year. It is quite exciting, especially with the cloud-computing possibilities for my own company's applications.
  • I saw Microsoft's Project Gemini for the first time, a new enhancement for Excel that will do for database querying what PivotTables did for spreadsheets. The presenter was one of the developers from Microsoft itself. He shows Excel pulling in millions (!) of rows of data and then analyzing it PivotTable fashion almost instantly. He showed how Gemini can correlate ad hoc data with query data. He showed how SQL Server Reporting Services reports can serve as a data source so that users can re-package their own reports based off of the underlying query that feeds the SSRS report. He blew me away. I can't wait for this feature in the next version of Excel. And I am curious as to whether Excel Server might be a useful tool for some of our customers. It is just so exciting what Gemini is going to bring in terms of user-level Business Intelligence options.
I can't wait for the next Code Camp to come along. I'll definitely be there.

Tuesday, October 20, 2009

To Be Continued

Have you ever wanted to print a "Continued Next Page" message at the bottom of a report? I see this most commonly for contribution statements, but it can be on any report that sometimes requires more than one page per person in the output.

This solution for ShelbyQUERY's Report Designer (which is a variety of Active Reports). It involves three control objects and a one-line VBScript. This solution assumes the following facts about your report:
  • You are using GroupHeader1 as the "per page" grouping level. If you are using another Group level, substitute it for GroupHeader1 as needed.
  • You have changed the NewPage property for GroupFooter1 to 2 - After.
  • The PageFooter section remains as part of the design, and it is where the "Continued" message should print.

To begin, add the following three control objects to the PageFooter section, changing the properties to match those listed beneath each control:
  1. a Bound Control
    1. (Name) = ctlCurrentPage
    2. SummaryGroup = GroupHeader1
    3. SummaryRunning = 1 - ddSRGroup
    4. SummaryType = 4 - ddSMPageCount
    5. Visible = False
  2. a second Bound Control
    1. (Name) = ctlTotalPages
    2. SummaryGroup = GroupHeader1
    3. SummaryRunning = 0 - ddSRNone
    4. SummaryType = 4 - ddSMPageCount
    5. Visible = False
  3. a Label
    1. (Name) = lblContinued
    2. Caption = Continued on Next Page
Put the "Continued" label wherever you want it to print. Because they will remain invisible on the print out, the two bound controls can be anywhere.

Open the Script Editor and change to the PageFooter object. Then change to the OnBeforePrint event. Because the PageCount value works like an aggregate value, this script must be placed in the OnBeforePrint section. Paste the following line of code in between the Sub and End Sub lines, as shown:

Sub OnBeforePrint

rpt.lblContinued.Visible = Not Eval("rpt.ctlCurrentPage.DataValue = rpt.ctlTotalPages.DataValue")

End Sub

The code checks to see if the current page number is equal to the last page number in the segment. If they are the same, the comparison will evaluate to True. If they are different (for any page before the last page), the comparison will evaluate to False. The "Not" reverses this value, so that the "visible" property of the label will be False for the last page of the segment and True for every other page.

In short, you will see that the "Continued" message appears whenever there is more than one page for a given GroupHeader1 segment, but it will never appear on the final page of the segment.

Monday, October 19, 2009

Calculating Age in T-SQL

Wow. October has just flown by. It has been so hectic that I haven't had a chance to post anything for the last few weeks. I will try to make up for that with a few quick posts this week. This first one will be about calculating a person's age based on date of birth information.

In Shelby Systems v.5 database a person's date of birth is stored in the Birthdate column of the NANames table. The examples for calculating age will assume that structure. In addition, the software allows for month/day entries that have no year value included. The datetime data type does not allow this, so those entries are stored with a year value of 1796. Hold that fact in the back of your mind. It will be addressed before the end of this post.

At first blush, the calculation of an age seems quite straightforward, especially if you are familiar with the DATEDIFF( ) function. This is the function used to calculate the difference between any two dates, in any unit of measurement from milliseconds to years. The following query suggests itself as the simple solution:

select
NameCounter = n.NameCounter,
Age = datediff(year, n.Birthdate, getdate())
from
Shelby.NANames as n

Unfortunately, this is not an adequate solution. The reason it fails is the fact that DATEDIFF( ) counts any part of a year as a whole year. Thus someone born on December 31st, 2008 would show up as 1 year old as of January 1, 2009. Obviously, this is incorrect. To correct this, we will need to build in some conditional logic to check whether the person's birthday has occurred yet. If it has not occurred, we need to subtract 1 before displaying the final result. First, let's check the month of the year. If the person's birth month has not yet occurred, we should subtract one.

select
NameCounter = n.NameCounter,
Age = datediff(year, n.Birthdate, getdate()) - case when month(n.Birthdate) > month(getdate()) then 1 else 0 end
from
Shelby.NANames as n

This is much better, but it still leaves open the problem of running this on someone's actual birth month. Then we need to check to see if the birth day has happened yet. This will fix that:

select
NameCounter = n.NameCounter,
Age = datediff(year, n.Birthdate, getdate()) - case when month(n.Birthdate) > month(getdate()) or (month(n.Birthdate) = month(getdate()) and day(n.Birthdate) > day(getdate())) then 1 else 0 end
from
Shelby.NANames as n

Great! So far so good, but what about those 1796 dates? Those people should not have an age showing, because we cannot calculate properly. That requires a new level of CASE logic:

select
NameCounter = n.NameCounter,
Age = case when year(n.Birthdate) = 1796 then null else datediff(year, n.Birthdate, getdate()) - case when month(n.Birthdate) > month(getdate()) or (month(n.Birthdate) = month(getdate()) and day(n.Birthdate) > day(getdate())) then 1 else 0 end end
from
Shelby.NANames as n

That will calculate the age to within a year. I'll have to save the calculation of ages in months for those who are less than a year old for another day.

Friday, September 25, 2009

OnFormat vs. OnBeforePrint in Active Reports

Recently I was asked to assist with an internal report for the Sales and Marketing Department. It might interest you to know that we use Shelby v.5 for our own data management purposes, just like our customers do. The report was put together in ShelbyQUERY and the Report Designer. The report was mostly working fine before I was looped into the project except that there were a few calculated values that were not working as expected. Specifically, the report needed some calculated variances of bound controls that had been aggregated using the SummaryType property of ddSMSubTotal.

In order to calculate the variances, I wrote a VBScript to pull the values for the aggregated Bound Controls in the GroupFooter1 section and to do the necessary math, returning the results of the calculation as the caption property of a label. I put the script to trigger during the OnFormat event, which is the only event I had ever used for Active Report scripts in the body of the document. The process seemed straightforward enough, except that the calculated values that were returned by the script were based on the last scalar value of the field, not the aggregated value.

I almost started down the path of creating script for the Detail section that would sum up the values by leveraging the caption of an invisible label as a place to hold a running sum. But before I did that, I took a step back and looked at other options. One of the options I researched was the event choice for when the script activates. I discovered that the OnBeforePrint event occurs right before the object values are committed to the canvas, but after the object values have been calculated.

That was the "Aha!" moment for me. I changed the script from occurring in the OnFormat event to the OnBeforePrint event. Voila'. The calculations came out based on the aggregate values, just as I wanted to happen. So I learned the following rules:
  • Use OnFormat to work with bound controls before their values are aggregated.
  • Use OnBeforePrint to work with bound controls after their values are aggregated.
If I am not working with aggregated values, I will still prefer the OnFormat event, if for no other reason than it is the default event option.

Incidentally, I also discovered in this process that the way to explicitly reference the value of a bound control is to use its DataValue property. In the past I have used the implicit reference by merely using the bound control's name to stand in for its value. The following two lines of code are functionally identical:

objValue = rpt.Sections("Detail").Controls.Item("Field1")
objValue = rpt.Sections("Detail").Controls.Item("Field1").DataValue

However, because using explicit references is generally better practice than using implicit ones, from this point forward I will give preference to including the DataValue property as part of any reference to the value of a bound control.

Friday, September 18, 2009

Enumerating Options using VBScript

ActiveReports is limited in the ways that a user can input parameters at run-time. At least it is limited in the way it has been implemented for Shelby v.5's ShelbyQUERY application. In fact the only way to capture a parameter at run-time is to use VBScript's InputBox() function. This is serviceable for many contexts, but it can be a problem when the parameter is a long text string, such as the name of an Event from the Registrations application or a company name from the General Ledger. In these cases the user must remember the exact wording of the text and then type it in without any mistakes in order to match the value in the query results exactly.

Error checking can alert the user that the value has been typed incorrectly, but it would be best to help the user along by prompting him or her with what the acceptable values are. In fact it would be ideal to give the user an enumerated list of options, so the user only has to choose the number of the desired value and enter one number instead of typing a long text string. The script below will do just that.

This script should be associated with the ActiveReports Document object and the OnReportStart event. That way it will pop up at run-time and provide a filter for the contents of the report.

After you paste the script into your VBScript editor window, be sure to update the value of the strColumnName variable so that it corresponds to the desired column in your actual query results. Also change the strPromptMessage and strPromptTitle variables so that the InputBox() prompt is worded appropriately for your report.

Sub OnReportStart

Set oDict = CreateObject("Scripting.Dictionary")
Set rsQuery = rpt.DataSource_Shelby_.RecordSet

If rpt.DataSource_Shelby_.RecordSet.RecordCount > 0 Then

intCounter = 1
strCurrItem = ""
strTestItem = ""
strColumnName = "QueryColumnName"
strPromptMessage = "Enter the prompt message here."
strPromptTitle = "Enter the Prompt Title here"

Do
strCurrItem = rsQuery(strColumnName)
If strCurrItem <> strTestItem Then
oDict.Add intCounter, strCurrItem
strList = strList & vbCR & intCounter & " - " & strCurrItem
strTestItem = strCurrItem
intCounter = intCounter + 1
End If
rsQuery.MoveNext
Loop Until rsQuery.EOF

Do
intChoice = InputBox(strPromptMessage & vbCr & strList, strPromptTitle)
If intChoice <> "" and IsNumeric(intChoice) Then
strItemChoice = oDict.Item(CInt(intChoice))
rsQuery.Filter = strColumnName & " = '" & strItemChoice & "'"
If rsQuery.RecordCount > 0 Then Exit Do Else rsQuery.Filter = ""
End If
If intChoice = "" Then Exit Do
Loop

rsQuery.MoveFirst

Else

MsgBox("There are no results to use for this report. Please check the query and try again after you have a set of results.")

End If

End Sub

With this code in your ActiveReports document, your end-user will have a much easier time filtering the report without have to double-check the spelling of a long text value and will avoid frustration of locating a tiny typo in the string.

Thursday, September 10, 2009

Padding Numbers with Leading Zeroes

In some contexts a number needs to be padded in order to look right or to work in the desired context. For example, in the Shelby v5 software, the Company Number and Fund Number values are always displayed as three-digit or four-digit values, with padded zeroes for one- and two-digit vlaues. Thus, company "1" is displayed as either "001" or "0001."

Another example from Shelby v5 is when the NameCounter is used to create the path for an individual picture. Pictures are stored with a file name that includes seven digits for the NameCounter value, so the number is padded with however many zeroes are needed to make seven digits.

To show you how each of these situations can be handled, I will do some sample queries on the Shelby v5 table called CNHst. This table contains contribution history, but that is incidental to this exercise. I am using this table because it contains both a CoNu (company number) column and a NameCounter column.

To start with, here is a basic query to pull the values we eventually want to pad with leading zeroes. An example result set appears underneath.

select
CoNu,
NameCounter
from
Shelby.CNHst









The original values are numeric, and therefore they will always show up with only the digits in the number itself. We cannot "pad" numeric values, so the first step is to change the numeric values into character values, using the CAST() function.

select
CoNu = cast(CoNu as varchar(4)),
NameCounter = cast(NameCounter as varchar(7))
from
Shelby.CNHst









The only visible change is that the values are now aligned to the left of each cell instead of aligned to the right. However, now that the values have been changed into character values, we can add zeroes to the left by a simple concatenation of a literal text string of zeroes.

select
CoNu = '000' + cast(CoNu as varchar(4)),
NameCounter = '000000' + cast(NameCounter as varchar(7))
from
Shelby.CNHst









To make each row the same fixed length of four digits (for CoNu) and seven digits (for NameCounter), we need to cut off the extra zeroes. The best way to do this in T-SQL is to use the RIGHT() function to pick up the string of characters starting with the rightmost character and counting left the desired number of digits.

select
CoNu = right('000' + cast(CoNu as varchar(4)), 4),
NameCounter = right('000000' + cast(NameCounter as varchar(7)), 7)
from
Shelby.CNHst









Now we have uniform output with each number reflected as desired with a fixed number of digits and padded zeroes as needed.

Tuesday, September 8, 2009

Counting Rows

I was travelling last week and didn't get a chance to post, so I'm going to post a "bonus" message today to make up for it. In fact, it was last week when I got an issue of the SQLServerCentral.com newsletter with an article that caught my eye. It was entitled How to Get Table Row Counts Quickly and Painlessly by Kendal Van Dyke. The article points out that the total row counts of all the tables in a database are stored in system tables that may be queried.

I am always interested in getting useful information out of the system tables, especially when it can provide a new perspective on the database itself. If you read my earlier blog entry on finding all the tables that contain particular column names, you probably know that already. The technique in the article is a way to look at all the row counts of all the tables in a list, which could be a great guage to the overall "size" of the database, and it could give insight to sluggish queries, if a table happened to be much larger than expected.

Of course it would be possible to write the simple query to count the rows of one table at a time, like this:

select count(*) from Shelby.NANames

This approach is valid, and it is still the only way to do a count on rows that meet particular criteria, because you can add a WHERE clause to limit the rows. The system table approach always yields the total number of rows, with no such filtering possible.

Nevertheless, getting an overview of the database rows from the system tables can still yield interesting, if not essential, information. Here is the code from Kendal's article, though I encourage you to sign up with sqlservercentral.com and read it for yourself as well.

-- Shows all user tables and row counts for the current database

-- Remove is_ms_shipped = 0 check to include system objects

-- i.index_id < 2 indicates clustered index (1) or hash table (0)

SELECT o.name,

ddps.row_count

FROM sys.indexes AS i

INNER JOIN sys.objects AS o ON i.OBJECT_ID = o.OBJECT_ID

INNER JOIN sys.dm_db_partition_stats AS ddps ON i.OBJECT_ID = ddps.OBJECT_ID AND i.index_id = ddps.index_id

WHERE i.index_id < 2

AND o.is_ms_shipped = 0

ORDER BY o.NAME

For Shelby v.5 users, this code can run straight from ShelbyQUERY, so there is no need to have any special querying software running to check out the database. Try it, and take a peek "behind the curtain" of the database.

Posting SQL Code to HTML Pages

This is just a quick post with a simple heads-up when you want to post query code to an HTML environment, such as a web page or a blog entry. I have learned the hard way that the one symbol commonly used in SQL statements can trip up an HTML post: the "less than" symbol. This symbol is <, of course. It seems innocent enough, but in HTML environments the "less than" symbol is ALSO the "open HTML tag" symbol. An HTML page will expect some kind of tag information after that symbol, and when there is not an expected piece of HTML code after it, the results will be undesirable.

Worse, a blog editor such as blogger may lose the unrecognized text that comes after the symbol, causing the need to redo much of the work of creating the query statement. For example, I just had to redo a post I put up a couple of weeks ago because of a "less than" comparison in the query statement. After I posted it, parts of my code disappeared, and when someone tried to use it, she just got an error message. I only heard from one person, but anyone who tried to use it would have gotten nowhere with it. It's fixed now, so don't worry about having to figure out which one it was.

If you want to publish a SQL statement in an HTML environment, you will need to edit the HTML before you publish it. All you have to do is replace the "less than" symbol with the HTML-friendly code of & l t ; (without the spaces; I had to add spaces so the code wouldn't change into the symbol in this post) instead. You might want to change the "greater than" symbol to & g t ; (again, don't put in the spaces) while you're at it.

Friday, August 28, 2009

Handling Zero Results in a VBScript-Powered Report

If experience has taught me one thing about programming, it is this: expect the unexpected. If you write a report that asks for a number, expect the user to type in a word. If you are looking for a text string, expect someone to leave it empty. Today I'm going to tell you how to handle the unexpected situation of having a query return no results.

In a regular ActiveReports design, this is no big deal. The report just returns empty space where the data would be. However, if you have written VBScript to filter or manipulate the data in any way, then the result of a null set of query results is usually an ugly error message to the user about BOF or EOF, something the average user will blink at and then click OK with no real understanding as to what went wrong.

Luckily, adding a routine to handle a null-resultset situation is pretty easy. The trick is to use the RecordCount property of the ADO object to determine if there are results or not. If there are no results, you can tell the routine to skip all the data filtering and manipulation code and just run the "plain" report, which will still be blank because of a lack of data. As added information, you can pop up a message box to explain why the report will be blank.

Because most data manipulation is done before the report starts rendering, you will probably need to add this code only to the ActiveReports Document object and the OnReportStart event. However, if you are doing some further data manipulation in a later section of the report layout too, then add it as needed to each script in the report design.

Here is the sample VBScript code for a Shelby v5 Report design. The ADO object is called DataSource_Shelby_ (Yes the final underscore is part of the name. Why? I don't know. It's a mystery).

Sub onReportStart

If rpt.DataSource_Shelby_.RecordSet.RecordCount > 0 Then

{insert your data filtering or manipulation code here}

Else

MsgBox("There are no results to use for this report. Please check the query and try again after the query returns results.")

End If

End Sub

Once this code is in place, even when changes in the data or changes in the query mean that no results are available, the user will not be presented with a geek-speak message, only a warning and a blank report.

Wednesday, August 19, 2009

Querying the Schema

Since I didn't get to post last week, I'll post twice this week to make up for it.

Today I was presented with a problem in which a customer wanted to know which tables had been updated by a particular person in a particular time frame.

Immediately I knew that he would need to find out the values of the WhenUpdated and WhoUpdated columns, which are found in many of the tables of the Shelby Systems v5 database schema. The problem was that there was no way of predicting exactly which tables had been affected in the given time frame. In addition, there was no way of know exactly which tables existed in the customer's database, at least not without doing some digging or getting a copy of a full backup of the customer's data. Both of these options would take more time than was really called for.

What I needed was a query that would let me get a list of the tables that contained one or both of the columns I was interested in finding. This is called a "meta-query" because it queries the database schema instead of the regular data storage tables. Of course, this being SQL Server, even the schema is stored in tables. Specifically, the information about the columns in the tables is stored in a table called information_schema.columns.

I whipped out the SQL Cookbook by Anthony Molinaro, which is a superb resource for pushing the ordinary limits of SQL code, and I quickly found the entry on querying the schema to list column names in a given table. I simply reversed the conditions of the query to list the tables that contain a given column. The result was this:

select distinct
table_name
from
information_schema.columns
where
table_schema = 'Shelby' and column_name in ('WhoUpdated', 'WhenUpdated') and table_name not like 'vw_%'
order by
table_name

Running this code gives a list of all the tables that have either WhoUpdated or WhenUpdated. Because all of the tables that have one also have the other, the column_name condition is a bit redundant, but it shows how you can search for matches on one or more column names.

When the customer runs this query, it will list for him only the tables that are in his database installation, so it saves me having to figure out which tables he has and which he doesn't. Once he has the list of tables, it is a matter of running a query on each table to find a match on the date or the name of the last update information.

What is also great about this query is that it can be run straight from the built-in ShelbyQUERY module. There is no need to use the SQL Server Management Studio or any other outside querying mechanism.

P.S. If you're wondering, yes, there is an information_schema.tables table that stores information about the tables per se.

Monday, August 17, 2009

Listing Husband and Wife on One Row - Part 3

I didn't get a post in last week. I have been busy reviewing and revising the Fall 2009 training season workbooks and trying to get SSTips ready to send out as well. So today I am finally getting around to posting part 3 of the series on putting names together on one row. Today we'll look at how to put the names together into one column as a combined name. This is tricky because of the possibility of different last names, as well as the fact that sometimes spouse information is not available even when we know that the person in the database is married. Another factor is that the gender of the head of household may be male or female. And I'm not even going to get into handling special titles - such as Dr. - because that would add another layer of complexity to the basic process of combining names.

The following code puts two names together. It limits the results to those who have a TESTGRUP profile code, the same as in the previous parts of this series.

/* query begins */
select
SelectedName = SelectedNames.FirstMiddle + ' ' + SelectedNames.LastName,
Spouse = Spouses.FirstMiddle + ' ' + Spouses.LastName,
CombineIfMarried =
case /* This begins a conditional tree. The top branch of the tree is to check that the selected person is married. */
when SelectedNames.MaritalStatus in ('M','R') then
case /* The next branch is to check the gender of the selected person. */
when SelectedNames.Gender = 'M' then
case /* The bottom banch determines if the person and the spouse have the same last name. If there is no spouse info in the database, both comparisons will fail and the "else" condition applies. */
when SelectedNames.DifName + Spouses.DifName = 0 then Spouses.FirstMiddle + ' & ' + SelectedNames.FirstMiddle + ' ' + SelectedNames.LastName + isnull(' ' + SelectedNamesSuffixes.Descr, '')
when SelectedNames.DifName + Spouses.DifName = -1 then SelectedNames.FirstMiddle + ' ' + SelectedNames.LastName + isnull(' ' + SelectedNamesSuffixes.Descr, '') + ' & ' + Spouses.FirstMiddle + ' ' + Spouses.LastName
else 'Mr. & Mrs. ' + SelectedNames.FirstMiddle + ' ' + SelectedNames.LastName + isnull(' ' + SelectedNamesSuffixes.Descr, '')
end
when SelectedNames.Gender = 'F' then
case
when SelectedNames.DifName + Spouses.DifName = 0 then SelectedNames.FirstMiddle + ' & ' + Spouses.FirstMiddle + ' ' + Spouses.LastName + isnull(' ' + SpousesSuffixes.Descr, '')
when SelectedNames.DifName + Spouses.DifName = -1 then Spouses.FirstMiddle + ' ' + Spouses.LastName + isnull(' ' + SpousesSuffixes.Descr, '') + ' & ' + SelectedNames.FirstMiddle + ' ' + SelectedNames.LastName
else 'Mr. & Mrs. ' + SelectedNames.LastName
end
else 'Mr. & Mrs. ' + SelectedNames.LastName
end
else SelectedNames.FirstMiddle + ' ' + SelectedNames.LastName + isnull(' ' + SelectedNamesSuffixes.Descr, '')
end
from
Shelby.NANames as SelectedNames inner join
Shelby.NAProfiles as Profiles on SelectedNames.NameCounter = Profiles.NameCounter and Profiles.Profile = 'TESTGRUP' left join
Shelby.NANames as Spouses on SelectedNames.FamNu = Spouses.FamNu and Spouses.UnitNu = case when SelectedNames.UnitNu < 2 then abs(SelectedNames.UnitNu -1) end left join
Shelby.NASuffixes as SelectedNamesSuffixes on SelectedNames.SuffixCounter = SelectedNamesSuffixes.Counter left join
Shelby.NASuffixes as SpousesSuffixes on Spouses.SuffixCounter = SpousesSuffixes.Counter
/* query ends */

The CASE statement to put the names together has three levels, as noted with the comments in the statement. The first level divides those who are married from those who are not. The second level divides the people selected into male and female, and the third level divides those spouses with different last names.

Tuesday, August 4, 2009

Listing Husband and Wife on One Line - Part 2

Last week I posted on how to put both husband and wife on one line by joining to the names table (NANames for Shelby v5) multiple times. That is a useful technique when you want to show the spouse information for every name that is part of the desired result set. However, a different technique is needed when you want to show spouse information only when both spouses are part of the desired result set. A typical example might be a label report for the choir or another group in which one or both of the spouses might participate, and when both spouses do participate, they should both be combined. When only one spouse participates, the other should not be part of the results.

This new method selects the names as you would normally, joining only once to the names table. But then add a GROUP BY clause to the query, grouping by the common factor for the family (NANames.FamNu for Shelby v5). After you group the results, you can use a CASE statement to select either the Head of House or the Spouse name to appear in the results.

In the example query below I have selected all the adult choir members by using an inner join to the NAProfiles table and at the same time limiting the included Profile values to those that begin with MUAC, which is a common prefix on all the adult choir profile codes. The query uses GROUP BY and CASE statements to pull the spouses together and list them in the appropriate columns of the results.

select
HHFirstName = max(case when Names.UnitNu = 0 then Names.FirstMiddle end),
HHLastName = max(case when Names.UnitNu = 0 then Names.LastName end),
SPFirstName = max(case when Names.UnitNu = 1 then Names.FirstMiddle end),
SPLastName = max(case when Names.UnitNu = 1 then Names.LastName end)
from
Shelby.NANames as Names inner join
Shelby.NAProfiles as Profiles on Names.NameCounter = Profiles.NameCounter and Profiles.Profile like 'MUAC%'
group by
Names.FamNu

Here are the results:

You can see that the Blackwells , the Carters, and the Glovers are in the choir together. In the visible results all the other individuals do not have a spouse in the choir.

With the two techniques from last week and this week, you should be able to get the names of spouses together whenever you need them in one row of results. Of course, listing them on one row is usually only part of the requirement. Often you will want to combine them together into one column as well, with a result such as "John & Jane Doe" or "Mr. & Mrs. Smith." That process is surprisingly complex, and will be the topic for next week's post.

Tuesday, July 28, 2009

Listing Husband and Wife on One Line - Part 1

There are several ways to combine a husband and wife in query results, each one offering an advantage over the others depending on the overall design of the query. The first method I want to throw out there assumes that the Head of House will be the "primary" name chosen by the WHERE condition of the query. Adding the spouse involves joining a second time to the names table (NANames for Shelby Systems v.5).

Before examining the query statement itself, here are the noteworthy columns of the NANames table when it comes to combing family members together:
  • NameCounter - a numeric value that is unique for every person in the table
  • FamNu - a numeric value shared by all members of the same family; it is helpful to realize that this value is the same as the head of household's NameCounter value.
  • UnitNu - a numeric value indicating the family position in the household; the head of house value is zero (0) and the spouse value is one (1).
The following query lists all the heads of house in the NANames table and lists the spouses' names along side.

select
HeadOfHouse = HH.FirstMiddle + ' ' + HH.LastName,
Spouse = SP.FirstMiddle + ' ' + SP.LastName
from
Shelby.NANames as HH left join
Shelby.NANames as SP on HH.FamNu = SP.FamNu and SP.UnitNu = 1
where
HH.UnitNu = 0 and HH.TitleCounter > -1

The UnitNu condition in the WHERE clause is there to prevent the spouses and children from being listed on a separate line. The TitleCounter condition is there to eliminate businesses and other organizations from the list.

A simple variation of this approach allows you to add the spouse of either husband or wife, depending on who is included by the WHERE criteria. In the following example the WHERE condition is not given, but the FROM condition will add the spouse of whichever name is part of the result list.

select
HeadOfHouse = ChosenNames.FirstMiddle + ' ' + ChosenNames.LastName,
Spouse = Spouses.FirstMiddle + ' ' + Spouses.LastName
from
Shelby.NANames as ChosenNames left join
Shelby.NANames as Spouses on ChosenNames.FamNu = Spouses.FamNu and Spouses.UnitNu = ABS(ChosenNames.UnitNu - 1)
where
any conditional statements

The join to the "Spouses" instance of NANames uses a mathematical operation to return a 1 whenever the primary name has a UnitNu value of 0 and to return a 0 whenever the primary name has a UnitNu value of 1. Thus the spouse will be added to any head of house, and the head of house will be added to any spouse.

Both variations of this approach use a LEFT join to add the spouse information. This type of join means that all the primary names will be listed, regardless of whether there actually is a spouse in the system for that person or not.

Tuesday, July 21, 2009

Existential Queries

Recently I was asked to come up with a query that would select names of people who have one kind of record in a given table but from that group omit people with a different kind of record in the same table.

For those familiar with the Shelby Systems v.5 software, I needed to pull people with a particular Profile code but of that group omit people with a different Profile code.

The only method I could find to meet both condition was to use the EXISTS( ) function. This function returns a boolean true/false value based on whether or not a specified subquery returns any results. If the subquery returns results, the value is true. If it does not, it is false. As with most functions like this, the value can be reversed by the key word NOT.

I had run across EXISTS( ) some time ago, and I have used it periodically to simplify some query conditions. I'm glad I had it in my arsenal, because it was the only way I found to solve the problem of "find everyone who has one Profile, but not if he has specific second Profile too."

Here is the basic approach I took to solving this problem with a query, simplified to a minimum number of column references:

select
NameCounter, FirstMiddle, LastName
from
Shelby.NANames
where
exists (select * from Shelby.NAProfiles where Shelby.NAProfiles.Profile = 'value1' and Shelby.NAProfiles.NameCounter = Shelby.NANames.NameCounter) and not exists (select * from Shelby.NAProfiles where Shelby.NAProfiles.Profile = 'value2' and Shelby.NAProfiles.NameCounter = Shelby.NANames.NameCounter)

In regular English, the query says to pull the names of those who have a Profile of 'value1' but who do not also have a Profile of 'value2.' There may be other ways to achieve this kind of result, but this one worked for me.

Notice that the subquery's SELECT clause uses the asterisk (*) to stand in for all columns. Because the subquery is only used as a test to see if even one row is returned, there is no need to be specific about which column to select; the asterisk works fine.

Monday, July 13, 2009

Comparing Dates in SQL

I found out the hard way (read: customer calling in to complain that the query I wrote him was omitting some events from the calendar query I wrote for him) to NEVER use basic math comparisons (=, <, >) for dates; at least, I have learned not to use them without DATEDIFF( ). It is more difficult sometimes, but it is going to render the correct result every time, with the other comparison operators will not.

The problem is that dates are not stored merely as dates but as datetime values. This means there is an hour, minute, second, and millisecond value associated with every date, even if that value is ignored by the user interface. In most cases this is not going to cause a mistake when comparing two different dates, but it does cause a problem when one is trying to find equivalence between two instances of the same date.

Here is the underlying "gotcha": When the time of day does not matter, most databases store dates as if they were at midnight of the day in question whereas the GETDATE( ) function returns the date and current time. Any calculations based on the GETDATE( ) value also have the current time. And because 1/1/2009 00:00:00 is technically less than 1/1/2009 15:39:15 (or any other time after midnight), those two values will not satisfy a comparison using "=" even though they signify the same calendar day.

Here is a simple CASE statement showing the use of DATEDIFF( ) in conjunction =, >, or <.

CASE
WHEN DATEDIFF(day, date1, date2) = 0 THEN "The dates are the same."
WHEN DATEDIFF(day, date1, date2) > 0 THEN "Date1 is earlier than Date2."
WHEN DATEDIFF(day, date1, date2) <>


Using an approach like this will always give you the results you want, without any problems from the time element of the datetime value.

Monday, July 6, 2009

Reporting Services and check boxes

Most of the reports I make are created using a slimmed-down version of Active Reports, which has been integrated into Shelby Systems v.5 software as part of the ShelbyQUERY tool. Active Reports has a "checkbox" control object, which displays a checkmark in a box when the value of the field is anything other than "false," or an empty box when the value is "false." This can come in handy for reports with lots of Boolean (true/false) values.

Interestingly, even though SQL Server Reporting Services is much superior to Active Reports in many respects, it is lacking in a "checkbox" object. This seems to me to be a significant oversight, but it not completely insurmountable.

Although there is no "checkbox" object in Reporting Services, you can emulate one using a TextBox, if you are willing to have an "X" instead of an actual check mark in the box. Just do the following.

Create a new text box, put an "X" in it, and change the following properties to the listed values:

  • BorderStyle: Solid
  • Color: = IIf (Fields!MyField.Value, "Black", "White")
  • Size: .175in, .175in (this fits a 10pt font; make it larger for a bigger font)
With these setting the box will have an "X" in it for true values and be empty for false values. I haven't tried it yet, but I suspect it might even be possible to use a font like Wingdings and actually print a checkmark in the box instead of just an X.

Followers