Thomas Pagel BI Blog

Microsoft BI Technology & more

<October 2008>
SuMoTuWeThFrSa
2829301234
567891011
12131415161718
19202122232425
2627282930311
2345678


Navigation

SSIS Blogs

SSRS Blogs

SSAS Blogs

Other SQL Server Links

Subscriptions

Post Categories



SSIS Snippet: Dimension and Fact load with Surrogate Keys and Early Arriving Facts

Well, what a headline... So many BI buzz words, wow, what a solution this has to be... It's nothing special, not very trivial, but also not very hard to understand... at least I hope so...

From a post in the Microsoft Technical Forums for SSIS (http://forums.microsoft.com/msdn/ShowPost.aspx?PostID=75211) I got the idea to write my second snippet which shows how to create new dimension records from a fact table (so you get facts for a dimension record which doesn't exist, yet) and handles all the surrogate key stuff you need...

So what does this package do? It's based on Marco's post http://sqljunkies.com/WebLog/sqlbi/archive/2005/05/30/15684.aspx about how to work with surrogate keys. This is a nice example for loading dimensions and assigning surrogate keys on the fly.

I took this and placed some transforms around it (see a screenshot here: http://tpagel.gmxhome.de/snippets/keygen.jpg). There is a source and I assign the surrogate keys by doing a "merge join" with the dimension table. The problem is now that the dimension table might have some records missing. The solution is that I split up the source by a "multicast". The one side only does a "sort" to prepare for the "merge join". The other path is aggregated by the application key. I lookup the dimension table to find out the dimension key for these application keys. If that fails I generate a new surrogate key (thanks to Marco's code) and add the record with this new key to the dimension table. At the same time ("multicast" again...) I join this dimension record (with a "union all") with the already existing records from the lookup.

So now I have both, the existing records who got their surrogate key from the dimension table and the new dimension records just created. I sort them by the application key and then I can merge them with the facts using an "inner join".

The result is an updated dimension table with new records (including surrogate keys) from the fact table and the fact table itself with the right surrogate keys assigned.

I hope that this helps people to understand SSIS a little better. I'm quite sure that this is no perfect design but it works...

You can download the package (including the backup of a very simple sample database) here: http://tpagel.gmxhome.de/snippets/keygen.zip. Just add a few records to the "factSource" table, run the package and see what happens to the "dim" and "factDestination" table...

Comments are (as always) very appreciated...

 

Thomas

posted on Friday, August 19, 2005 11:45 AM by tpagel


# SSIS: On my whish list: A dynamic lookup transform @ Thursday, January 05, 2006 10:17 AM

I just issued a suggestion for the next SSIS version… I decided to make it public here just to ensure...

Anonymous




Powered by Dot Net Junkies, by Telligent Systems