March 2007 - Posts

Enable multi-phase data pump in DTS editor (Management Studio)

If you edit a legacy DTS package with SQL Server Management Studio, you may want to know how we can show the multi-phase data pump operations.

I haven't found an option to do this with SQL Server Management Studio. However, when you need to do something, you find a way to do that thing...

I found that necessary setting can be controlled through this registry key:

HKEY_CURRENT_USER\Software\Microsoft\Microsoft SQL Server\80\DTS\Settings




Cross-posted from SQLBlog! - http://www.sqlblog.com


SSAS 2005 SP2 breaks Excel calculated member selection

After many tries, today I came to the conclusion that Analysis Services 2005 SP2 breaks the Excel 2007 calculated member selection feature.

This is a scenario that works well:

  • Analysis Services 2005 SP1 + hotfixes (build 2153)
  • Excel 2007 RTM
  • Cube with several calculated members on a hierarchy or attribute (not on measures)
  • When you build a pivot table with Excel, you can freely select single calculated members from the hierarchy or attribute.

Adding Analysis Services 2005 SP2 even only on the server, you get all the calculated members of the hierarchy or attribute selected, and they are all always visible.

In my opinion, this is a breaking change of SP2. I don't see any reason for this behavior. Remember, is is suffice to update the server to break calculated member selection on any Excel 2007 client, regardless of the client component version you use.

I posted a bug here: please vote the bug to get an higher rating if you think (like me) that this requires a promptly response from Microsoft.




Cross-posted from SQLBlog! - http://www.sqlblog.com


How to get Distinct Count in SSIS

Almost one month ago I made a post about a brand new Distinct component for SSIS made by Alberto Ferrari and freely available on SQLBI.EU.

The most used way to get distinct rows in a Data Flow  that I've seen is to use the Sort component, asking only for distinct rows (there is a check box for that in the standard edit dialog box). The Sort component is a fully blocking one, because it requires to stop the flow until all the rows are read from the source. The Distinct component is not blocking, because it does not stop the flow and internally use hash functions and other structures to maintain a list of the alreay emitted rows. A previous post from Jamie Thomson describes pretty well pro and cons of the Sort component and raise the need for a better solution.

A couple of weeks ago, I realized that there is another option available. Using the Aggregate component, you can select all the fields you want in output (even all the columns) and select "Group By" as operations on them. Simply that. Not so intuitive to get a Distinct in such a way, but if you want performance... this seems the way to go.

From tests we made, using Aggregate component is the faster way to get a Distinct and it also doesn't consume as much memory as the Sort component. Unfortunately, also the Aggregate is a fully blocking component.

 Now we have three options to get a Distinct:

  • Aggregate component: it is standard, it is fast, it is fully blocking
  • Distinct component: it needs a separate install, it is fast but sometimes not as fast as the Aggregate, it is partially blocking
  • Sort component: it is standard, it is the slowest, it is fully blocking, it also sort the data

I think that in general we can use the Aggregate component, just because it does not require a separate deployment of the Distinct component. However, when the fully blocking behavior is not desirable, the Distinct component may worth the time to deploy/install another data flow component...




Cross-posted from SQLBlog! - http://www.sqlblog.com