A technical blog about my projects, challenges, and discoveries in the world of data warehousing using SQL Server, Power BI Desktop, DevExpress, and more.

Tuesday, October 18, 2011

Top Ten Things I Learned at SQL PASS Summit on 10/14/2011

Travel and busy-ness with my baby Destiny have prevented my posting this last installment in my list of top lessons learned at SQL PASS, but now here it is:

1. Buck Woody and Rob Farley are not only superstars in SQL technologies, they are talented as singer/songwriters too. Check out the video of the opening of the keynote session for proof. Cue it up at 00:02:27 to skip to the song. Their song was my favorite moment of the week.

2. On Wednesday I wrote that I wanted to learn more about Hadoop since it was part of a new feature of SQL Server 2012, and today I learned more about it than I really wanted to know, but the presentation by Dr. DeWitt was fascinating. Hadoop is a method of "divide and conquer" applied to massive amounts of unstructured (or more accurately loosely structured) data, the kind collected by websites such as Facebook, eBay, and Google.

3. If you add NOSPLASH after the path to SSMS executable in the shortcut properties, the application will open a few seconds faster. It does not seem like much, but it's a handy tip that I put in place right away.

4. If there are more than a couple of OR conjunctions in any given WHERE clause, using UNION ALL is probably a better approach. The query optimizer in SQL Server is much better at streamlining two or more small query conditions rather than one huge, complex one.

5. When possible, replace correlated subqueries in the SELECT clause with joins in the FROM clause. This is because the correlated subquery must execute once for every row in the main results, eliminating any optimization that is possible when the data are correlated in the FROM clause.

6. I learned how to create a "temporal snapshot fact table" for a data warehouse solution. This improves the efficiency of a "daily summary" fact table by using date ranges instead of individual days as the granularity level, ranges during which nothing changes in the data. I suspect this may come in handy when constructing fact tables for pledges and donations. It is definitely a nice concept to keep in the toolbox of report design.

7. I learned that the query optimizer can anticipate the best query plan even when part of the query statement has been "refactored" into a common table expression (CTE).

8. Thanks to Audrey Hammonds, I learned some great techniques for explaining CTEs effectively. I plan to use some of her presentation style and content ideas when I present an introduction to CTEs at SQL Saturday #96 in Washington, DC.

9. CROSS APPLY is comparable to an inner join and OUTER APPLY is comparable to a left join. In that same context, I learned that there is no difference in query efficiency to move a CROSS APPLY table valued expression into a correlated subquery within the SELECT clause. This is reassuring, since I have already made such a move with the FOR XML technique of turning row values into delimited strings.

10. Snow Crash by Neal Stephenson is a book I need to read. When Audrey referenced it in her presentation, about half the audience recognized it and praised it as a great read. She mentioned a few other books throughout her talk, but this one definitely stirred the strongest positive response.

Friday, October 14, 2011

Top Ten Things I Learned at SQL Pass on 10/13/2011

The learning continues...

1. The "AlwaysOn" feature of SQL Server 2012 greatly simplifies high availability through redundant servers and simultaneously optimizes reporting by allowing you to use secondary servers for reporting and the primary server for transaction processing. However, it is currently unknown whether or not AlwaysOn will be in any edition other than Enterprise. It seems likely that it is an Enterprise-only feature.

2. The "ColumnStore Index" feature in SQL Server 2012 could radically speed up queries against tables in the Shelby v.5 database, especially attendance history. I can't wait to try it out. I may install Denali to try it out even before the official release of 2012.

3. There is a version of SQL Server Express coming out with 2012 that can easily be embedded into applications. This could potentially be a way to simplify the distribution of Shelby v.5 for those customers who do not own a license of SQL Server Standard. I will be looking into it more back home.

4. The "Semantic Model" features of SQL Server 2012, when coupled with the "FileTable" feature make for a powerful combination, capable of doing searches for document content based not just on matching of strings but actually pairing of concepts. I was blown away by the demo. Any librarian in the world would have been salivating over the processes and results I saw.

5. Two more times in one day I was shown the fallacy of putting dates inside of a function in the WHERE clause. I will never do it again, I promise! But I also learned that datetime and smalldatetime are being depricated in SS2012, meaning that we should probably start using the other, new data types for time. I will pass that info along to our database developers.

6. I learned that by simply changing the varchar data type to nvarchar, we could be capturing special characters for names, addresses, and more, which would make the v.5 database friendlier to international users. I'm going to do some testing on this when I get home too, to verify what is happening in our db right now when someone puts special characters into the varchar fields.

7.  I need to kick the habit of not specifying the length of varchar values and others when I use them as variables. I've been snakebit by that one before, and what I learned today just reinforces that I need to specify length 100% of the time.

8.  Some people, no matter how smart they are about what you want to discuss, can just be too annoying to talk to. It's just better to bide your time and find someone who may not be as knowledgeable but who is willing to listen and to converse at a pace that matches your own.

9.  The 2013 Summit will be in Charlotte, NC. That will be much more convenient for me to attend, and I am very much looking forward to not having to adjust to a new time zone in order to participate.

10.  Making free form reports with PowerPivot data in the background is not only possible, it allows for greater control over formatting and the embedding of elements such as sparklines into the data results. Free form reports can even use slicers to control the content. The only downside is the fixed nature of the rows and columns, which is why PivotTables are so handy in the first place.

Thursday, October 13, 2011

Top Ten Things Learned at SQL PASS on 10/12/2011

Wednesday is when the conference starts in earnest, and so I definitely learned more than I can summarize here, but these ten things really stood out to me. As with my first list, this is roughly chronological over the course of the day.

  1. There are over 4,000 registered attendees at this event. Add in the presenters and the vendors, and you can imagine how packed the rooms are. It is good to have such a thriving community; and yet, it does strain the capacity of the venue.
  2. SQL Server Code Named "Denali" has officially been named "SQL Server 2012," and is expected to be released in the first half of next year. "Crescent" has officially been named "PowerView" and will be a part of the SS2012 release, but apparently it requires the Enterprise edition. This is unfortunate, because that will put it in reach of only a couple of our Shelby Systems, Inc. customers.
  3. SQL Server 2012 will provide connectors and (eventually) drivers for something called "Apache Hadoop." This is something I will need to learn more about. Apparently Hadoop is a way to consume gigantic amounts of essentially unstructured data (on the order of pedabytes) and generate meaningful output. I don't know that I'll ever need that in my current role at Shelby Systems, but it is an interesting feature.
  4. I learned several things during the demos in the keynote, including: "sentiment" is now a measureable fact that can be quantified and reported on through the Microsoft Data Marketplace and the upcoming "Data Explorer"; only John Wayne has made more movies than Samuel L. Jackson.
  5. Stephen Few is a name I heard multiple times with regard to optimizing report style and layout. I believe I'm going to have to look into his web site at www.perceptualedge.com and check out his books.
  6. If I can get ahold of some mapping files that include the ZIP code layer (the Census Bureau is said to have some in the public domain), I should be able to get the mapping feature in Reporting Services to map down to that level. That would be much better than the current limit of reporting at the state level.
  7. Satori Software (a vendor here as well as a partner with Shelby Systems, Inc.) could actually return latitude and longitude information back to the NAAddresses table if we added that to our service agreement with them. The additional licensing cost makes it unlikely that this feature would be a part of all our customer's ShelbyMAILROOM package, but it is interesting to know.
  8. The inactivated slicer problem I noticed in the Excel PivotTables that are created programmatically from ShelbyQUERY is probably triggered by some option in the automated set up of the table. A manually-create OLE DB connection can use slicers just fine. It appears to have something to do with an OLAP-related setting, but I'm going to have to follow up on this more with Microsoft. Helpfully, one of the PowerPivot techs gave me his contact info and invited me to e-mail him if I couldn't pinpoint it on my own. I have to give props to the Microsoft staff here at PASS, who are as helpful as they can be.
  9. I have been doing date filtering in the WHERE clauses of queries in a way that will impede the efficiency of the query, possibly adding multiple seconds of time to getting the query results.  I have to learn ways to avoid putting table columns into functions, even the DATEDIFF() function.
  10. The pie chart is not an effective way of communicating comparisons in reports. Bar charts are far an away the best chart for most comparisons. Gauges too are relatively ineffective communicators. But the bullet chart is exceedingly good at communicating "KPI" style data with more context. I will have to learn how to create them in Reporting Services.

Wednesday, October 12, 2011

Top Ten Things I Learned at SQL PASS on 10/11/11

More lessons from the SQL PASS Summit 2011:

  1. When your laptop is a block or more away, that is the point at which you will most need it urgently.
  2. It is quite a good feeling to be an "alumnus" of the event and be able to help the first timers with getting where they need to be and oriented to the essentials for the best PASS experience.
  3. Although the views I have built for use in the ShelbyQUERY environment of our software are helpful for simplifying query writing, they are not really "data warehouse" structures. They blend too many distinct "dimensions" of data together, and they confuse (at points) the dimension/fact dichotomy. Thus I cannot really use that work as the guideline for a data warehouse database design.
  4. A dimension that is a "role playing" dimension does not have to exist as multiple instances in the data warehouse, not even as views of the different roles. If two keys in the fact table reference the same dimension for distinct purposes, it is a "role playing" dimension.
  5. Pragmatic Works, a SQL consulting firm based in Jacksonville, Florida has actually designed a data warehouse solution for a large Baptist church in that state. I'm sure my jaw dropped when Brian Knight (founder of Pragmatic Works) told me this in today's precon session. How could this be, and I didn't know it?! I will be meeting with Brian at the Pragmatic Works booth later this week to find out more.
  6. The topic "Building a Microsoft Business Intelligence Platform" sounds too ambitious to be effectively covered in a one-day training session because it actually is. Yes, Brian and Devin (the presenters) touched on each phase of the process, but the touches were alternately too light and too technical, leading to frustration on my part and (I suspect) on the part of others in attendance. It was often a case of lingering too long on a technical detail of the process so that the presenters rushed the remaining steps in the current topic to move on to the next.
  7. Subway stores in Seattle do not stock provolone cheese and do not accept the points card that is so handy in the Subway stores in Virginia. On the other hand, they do have the caloric values of all the sandwiches and other menu items displayed on the menu.
  8. There is an interesting similarity in the number of languages one must learn to be a biblical scholar and to be a SQL Server developer. To be a biblical scholar, one must learn Hebrew, Greek, Aramaic, German, and Latin (at least). To be a SQL Server developer, one must learn T-SQL, MDX, DAX, VB.net, and the proprietary "expression language" of SSIS and SSAS. And, just like the languages of Hebrew and Aramaic, they can be similar enough to be confusing when you are trying to learn them all.
  9. The PASS Virtual Chapter for Business Intelligence has four to five live webinar-style presentations a month, generally over the lunch hour, and these webinars are then posted for free as recorded webinars the next day. I will definitely be joining this "chapter" of PASS to take advantage of the resources it provides and (perhaps down the road) to find a way to contribute back to the SQL community that has so helped me in my work these last couple of years.
  10. There are people here at PASS who are familiar with Shelby Systems, Inc, even apart from the Pragmatic Works folks who worked with our customer in Florida. I met a man at lunch today from Houston who attends a church where the staff uses Shelby v.5. He volunteers as IT support at his church of about 1000 members. They have been users of v.5 since before the converstion to SQL Server from MS Access, and he confirmed that the move was a tremendous improvement in the stability of the database.

Monday, October 10, 2011

Top 10 Things I Learned at SQL PASS on 10/10/2011

I have the priviledge of being here in Seattle for the SQL PASS Summit conference all week, including two days of "pre-convention" all-day training topics. Today was the first day of the conference, and I wanted to continue a tradition of summarizing the top ten things I learned today at PASS. These are in roughly chronoligical order.
  1. It is good to get back on a treadmill and work out, even though a four-month break means that I could only go 2.6 miles in 45 minutes instead of 3.1 miles in about 40 minutes.
  2. The scuttlebutt I heard last year is true: the Paramount Hotel is indeed a great place to stay for PASS, as it is reasonably priced and only a block or so from the convention center. And most of the walk there is under covered awnings. If you are familiar with Seattle weather in October, you know this is no small fact.
  3. Talking to and singing to my almost-three-month-old daugher Destiny over the phone is a pale substitute for doing those things with her in my arms, but even through the phone her baby talk can melt my heart.
  4. Apparently SQL PASS Summit attendees receive a new backpack every year.
  5. The Guidebook app for Android phones (and probably iPhones too) is a great tool for an event such as this conference, and it is definitely something we should consider using for the International Shelby Conference next year.
  6. In SQL Server Reporting Services, if you use a "cast" function (such as CDate or CStr) on a textbox value to give it a specific data type, it opens up a large set of VB.net methods that can make working with that value much easier. This is especially true for datetime values.
  7. I really need to become comfortable with creating and using stored procedures over plain Jane SQL queries and views. They offer distinct benefits, which appear to be worth the trouble of being unable to create them easily with ShelbyQUERY.
  8. In Reporting Services it is possible to create code blocks that do more involved VB.net functions than can be summarized into a single one-line of code in an expression. I knew this had to be the case, but today I learned how to make that happen.
  9. Using a "post back" concept in the Action property of a Reporting Services report element creates the ability to have interactive "buttons" that change the parameters of the report content; for instance, it is possible to add "slicer" style buttons to a chart or matrix report design.
  10. A walk from the hotel to Starbucks to get a refreshing shaken iced tea lemonade takes just enough time to wait out a false fire alarm at the hotel and return to see the firefighters climb into the fire engine and head out.