A technical blog about my projects, challenges, and discoveries in the world of data warehousing using SQL Server, Power BI Desktop, DevExpress, and more.

Saturday, November 20, 2010

Transactional Replication: A Journey into the Unknown Part 2

I worked some more on this transactional replication process this morning. I discovered that the snapshot creation problem was being caused by one or more "views" that included select * somewhere in them. I eliminated all my personal views and left only the "official" views created automatically by the Shelby v.5 system, and the snapshot creation process completed normally.
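In case anyone else hits the same wall, here is how I could have scanned for the offending views up front. This is a minimal sketch: it assumes SQL Server 2005 or later (for sys.sql_modules), and the LIKE test is naive, so it will also flag a "select *" that only appears inside a comment.

select v.name as ViewName
from sys.views as v
inner join sys.sql_modules as m on m.object_id = v.object_id
where m.definition like '%select *%';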

After this, things went smoothly until I tried to initiate the first replication process, at which point I started getting errors saying that the "replication subsystem failed to load" and also that the process was "rejected" because the job was "suspended." It took more digging, but I discovered that all I needed to do was restart the SQL Server Agent service. After I restarted it and tried again, the synchronization process "started successfully."
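For anyone hitting the same "suspended" errors, it can also help to peek at the replication agent jobs directly before bouncing the service. This is a minimal sketch; it assumes the standard "REPL-" prefixed job categories that the replication wizards create in msdb:

select j.name as JobName, j.enabled as IsEnabled, c.name as Category
from msdb.dbo.sysjobs as j
inner join msdb.dbo.syscategories as c on c.category_id = j.category_id
where c.name like 'REPL-%';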

As I type this the first synchronization is still in progress. It's been running for several minutes, and I'm not sure when it will complete. I'm going to let it keep running, and I'll post again once I have more to report.

For now, here are three things I learned about implementing Transactional Replication that were not included in the 1-2-3 steps of the SQL Server Management Studio wizard:


  1. Make sure the SQL Server Browser service is running on the Publisher/Distributor server(s).
  2. Do not include any views as replication articles if they include select * anywhere in them.
  3. After setting up the Subscription on the Subscriber server, restart the SQL Server Agent.

Knowing these three points would have saved me several minutes of hunting down the causes of the problems and working through the troubleshooting steps. I'm sure there will be more to learn ahead.

Transactional Replication: A Journey into the Unknown Part 1

As excited as I am about the potential an OLAP database would offer for reporting and other business intelligence solutions, I discovered yesterday that when it comes to SQL Server concepts, high availability is currently more on the minds of our development team. As a result I have started exploring concepts such as replication, mirroring, failover databases, and the like.

After reading over all the options, I have decided to try to create a transactional replication solution as a test case. My desktop will be the primary server (where the transactional database lives) and also the Publication server (where the database is "staged" for replication). My laptop will be the Subscription server (where the remote duplicated database lives). 

After installing SQL Server Management Studio and the Replication tools onto my laptop, I ran the SSMS wizards on both my desktop and my laptop. The desktop SSMS installation ran the Publication wizard without a hitch, except that the SQL Server Agent service wasn't running; I started it manually after the fact. At that point, everything appeared normal.

Then I ran the subscription wizard. I had trouble getting the SSMS on the laptop to connect to the desktop server instance until I figured out that the SQL Server Browser service was needed but was not running. I started the browser, and then the laptop connected to the desktop like a champ. I walked through the wizard without any trouble until I got to the message that the initial snapshot of the source database needed to be made before I could continue.

I checked the desktop SSMS replication properties to verify the existence of the snapshot, only to instead find out that the snapshot did not exist. I tried to run the SQL Snapshot Agent manually, but I started getting errors during the process. Something about GROUP BY and the need for LEFT OUTER JOIN. 

Well, this took me a couple more hours than I was expecting, so I will have to leave the troubleshooting of the Snapshot Agent for another day. But once I figure out what is happening with that, I'll post the next chapter in the transactional replication adventure.

Tuesday, November 16, 2010

Top 10 Things I Learned Thursday at SQL PASS Summit

Sorry for the delay in posting this last Top 10 list of things I learned at the SQL PASS Summit in Seattle. Here is the final list:

10. The Paramount Hotel is probably where I need to stay next year if I get to come back.
9. A data mart/data warehouse solution would answer a lot of the temporal reporting challenges I have been unable to solve with straight T-SQL against the OLTP database.
8. The query execution process turns out to have three layers: the visible "typed" layer, the logical layer, and the physical layer. I was only aware of the first two before.
7. Query optimization is more arcane and difficult than I would have ever thought (not that I ever thought about it).
6. The "shared data sources" and "mini-charts" in Reporting Services 2008 R2 will be a great help for self-service reporting.
5. The T-SQL APPLY operator lets a query apply a table-valued expression or function to every row of the "left" table. Believe me, it is much more useful than it sounds (see the sketch after this list).
4. "ETL" means "extract, transform, and load."
3. I need to learn how to use the ETL functions in SQL Server Integration Services.
2. A "fact table" is a table comprised of measureable (i.e. usually numeric) "facts" along with a set of keys to to the facts to "dimension tables" comprised of descriptive (i.e. usually non-numeric) "dimensions."  Together these two tables form a "star schema," the fundamental structure of data marts and data warehouses.
1. I need to learn how to create a data mart for the Shelby v.5 OLTP database.
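Since number 5 is hard to appreciate in the abstract, here is a minimal sketch of CROSS APPLY. It reuses the Shelby.CNHst giving table that shows up elsewhere on this blog; treat it as an illustration rather than production code. For each giver, it pulls the single most recent gift:

select N.NameCounter, LastGift.CNDate, LastGift.Amount
from (select distinct NameCounter from Shelby.CNHst) as N
cross apply (
 select top 1 C.CNDate, C.Amount
 from Shelby.CNHst as C
 where C.NameCounter = N.NameCounter
 order by C.CNDate desc
) as LastGift;

The subquery after CROSS APPLY runs once per row of the left table, which is exactly what makes it so handy for "top N per group" problems.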

As a result of SQL PASS Summit 2010, you will be reading a lot more about PowerPivot, data marts, OLAP cubes, and more. I certainly have enough to learn to keep me busy until next year's Summit!

Thursday, November 11, 2010

Top 10 Things I Learned at SQL PASS Summit Today (Wednesday)

10. The joy of having the ability to play arcade games an unlimited number of times all night long is somewhat tarnished by two factors: a) the hundreds of other people in the same arcade with equally unlimited number of times to play and b) I am no longer 12 years old.
9. What I do is actually a hybrid of T-SQL development and BI development.
8. It is an interesting feeling being the SQL novice in every conversation during the conference.
7. I should never post proprietary code on my blog. I don't think I've done that exactly, but the references to the Shelby v.5 database might be a gray area on that one. I am going to have to reconsider my posting style, and I may change how I handle code that is specifically for Shelby v.5 data.
6. I really need to learn the DAX "language" to make PowerPivot a truly potent reporting tool.
5. PowerPivot has no innate "grouping" ability. This means that the query tables I make for PowerPivot usage need to have grouping columns. A little planning ahead will go a long way in making that easier (see the sketch after this list).
4. SharePoint is growing in importance as a collaborative platform in the Microsoft world, meaning that it will be a necessary tool for managing BI tools.
3. I can set up a SQL Azure account for just under $10 a month to have my own little database-in-the-cloud architecture to play with.
2. Reporting Services uses VB .Net for its expression language.
1. I have not even scratched the surface of the amazing things Reporting Services can do.
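To illustrate number 5, here is the kind of grouping column I mean, again using the Shelby.CNHst giving table from elsewhere on this blog as a stand-in. The $1,000 threshold and the band labels are made up for the example:

select
 C.NameCounter,
 C.CNDate,
 C.Amount,
 year(C.CNDate) as GivingYear, -- ready-made grouping column for PowerPivot
 datename(month, C.CNDate) as GivingMonth, -- another grouping column
 case when C.Amount >= 1000 then 'Major' else 'Regular' end as GiftBand -- hypothetical banding
from Shelby.CNHst as C;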

Tuesday, November 9, 2010

Top Ten Things I Learned at SQL PASS 2010 Today

10. The food served at dinner leaves much to be desired.
9. Report Designer v.3 (in SQL Server 2008 R2) has a mapping feature that can be really cool, but needs a lot of massaging to be truly useful.
8. Building a cube in Analysis Services works very easily if the data are stored in a "star schema." I don't know what that is, but I intend to find out.
7. The PowerPivot tool in MS Excel 2010 is not innate to Excel and must be downloaded and installed separately.
6. The PowerPivot tool is a basic cube technology, so it may not be necessary to learn how to make a "star schema" in order to use cube-style analysis on Shelby v.5 data.
5. Never assume that room 2 is immediately adjacent to room 3.
4. Recursive Common Table Expressions, which I have only recently discovered how to use, should be avoided because of the overhead they add to a query.
3. Tally tables are a way to avoid recursive CTEs (see the sketch after this list).
2. SQL Server "Denali" is going to be available tomorrow for public review of the beta code.
1. SQL Server "Denali" includes "Crescent," a tool that will take PowerPivot ease of use and put it into  Reporting Services, along with a slew of new graphical representation report tools that are simply mind-blowing.

Tuesday, November 2, 2010

MsgBox Response Values Not Returned

I ran into a very odd problem today. I was programming some VBScript into a ShelbyQUERY report design, and I wanted to give the user a Yes/No option using a MsgBox() function. Normally I just use MsgBox() to inform the user of things or to double-check my own work during development by popping up variable values during execution. So normally the only button to click is "OK," and the program resumes after the click.

But today I wanted to know whether the user clicked "Yes" or "No," so I changed the parameters and tried to capture the value of the click. I got unexpected results: the same value came back whether I clicked "Yes" or "No."

After doing some digging and double-checking the values the MsgBox() buttons are supposed to return, I discovered that no value was being returned at all. I tested this with a much simpler case outside the main script, and sure enough, it confirmed that the MsgBox() function returns no value to the calling script.

I believe the problem is not with VBScript per se but with its implementation in the Report Designer (the Shelby v.5 flavor of ActiveReports). InputBox() works just fine, but MsgBox() apparently does not.

So, when you are writing VBScript for the ShelbyQUERY Report Designer, my advice to you is to use MsgBox() only for informational pop-up messages and not for capturing responses. InputBox() is the only way to capture a response from the user, even when you just need a "yes/no" response.

Monday, November 1, 2010

Using Variables in T-SQL

Although there is no way to create a "pop-up" requester using Transact-SQL on its own, there is a way to handle frequently changing parameters that makes them easier to maintain over time. Instead of leaving them buried in the WHERE clause, which usually sits near the bottom of the query, sandwiched between the FROM clause and the GROUP BY clause (if there is one), you can move them to the very top of the query, above even the SELECT clause, by using variables. Using variables in a query is a three-step process:

First, define the names of the variables and what type of data they will contain.
Second, set the values of the variables.
Third, reference the variables in the query wherever you normally would use the explicit value.

The first step is to define the names and types of the variables you are using. There is one naming convention you must follow: the name of the variable must begin with an @ symbol. After that, any combination of letters and numbers is allowed, so the variable may have any appropriate name. The datatype may be any valid datatype supported by SQL Server; there is a list of them at http://msdn.microsoft.com/en-us/library/ms187752(SQL.90).aspx. The syntax for defining the names and types is as follows:

DECLARE @variable AS datatype [, @variable2 AS datatype ... , @variableN AS datatype];

You may declare as many variables as needed in one DECLARE statement. Note the semicolon at the end of the statement. It is usually optional, but it clarifies the end of a SQL statement.

After defining the variables, you set their values. Unlike the DECLARE statement, the SET statement may only affect one variable at a time:

SET @variable = value;

Again, the semicolon at the end is usually optional. However, if you are going to use a Common Table Expression to define a "view" before the main query, you must include the semicolon after the SET statement.
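For example, here is the pattern I mean, reusing the CNHst table from the example further down. The semicolon after the SET statement is the one that matters:

declare @StartDate as datetime;
set @StartDate = '1/1/2010'; -- this semicolon is required before the WITH keyword
with RecentGiving as (
 select NameCounter, Amount
 from Shelby.CNHst
 where CNDate >= @StartDate
)
select NameCounter, sum(Amount) as TotalGiving
from RecentGiving
group by NameCounter;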

Finally, you use the variable in the query. Put the @variable reference anyplace where you would normally type in the specific value. In the future, you can update the value by changing the variable assignment in the SET statement instead of searching through the query for the explicit value.

Here is an example of using two variables, @StartDate and @EndDate, to search for all the giving recorded in the CNHst table between two dates.

declare @StartDate as datetime, @EndDate as datetime; -- step 1: declare the variables
set @StartDate = '1/1/2010'; -- step 2: set their values
set @EndDate = '3/31/2010';

select
 C.NameCounter, sum(C.Amount) as TotalGiving
from
 Shelby.CNHst as C
where
 -- step 3: reference the variables; the datediff comparisons make the range inclusive at the day level
 datediff(d, @StartDate, C.CNDate) >= 0 and datediff(d, C.CNDate, @EndDate) >= 0
group by
 C.NameCounter


Even in this simple example, you can see that it is easier to update the dates at the top of the query rather than updating them in the WHERE clause. Once you are comfortable using variables, you will find them immensely helpful in making your queries easy to update over time.
