BI (RSS)

SSIS BUG - Migrate DTS 2000 Package fails when pulling packages with trailing blanks

So today I was testing out the SSIS "Migrate DTS 2000 Package" wizard.  I went through the following steps:

1.  In BIDS, I right-clicked the SSIS Packages folder in the Solution Explorer.

2.  Selected Next at the Wizard intro.

3.  Selected the source location (where I'll be pulling the packages from).

4.  Selected the destination location (the folder where the converted packages will reside).

On the next dialog box I'm supposed to see a list of DTS 2000 packages on the source SQL Server instance, but instead I get the error:

"Index was out of range. Must be non-negative and less than the size of the collection.

Parameter name: index (mscorlib)"

After digging through the newsgroups, I found a tip that this was caused by trailing blanks in the DTS package names.  I ran the following query to identify the offending packages:

SELECT DISTINCT name
FROM msdb.dbo.sysdtspackages
WHERE name LIKE '% '

After renaming the packages (removing trailing blanks) - the List Packages dialog box worked!  Thanks Koni Kogan!

Loading legacy data? Use control files...

If your data warehouse or data mart is populated from external legacy files, be sure to request a control file too.  By control file, I mean a file that contains the name of files they plan on sending and their associated row counts.

FTP processes can be interrupted.  Files get corrupted.  Using a control file adds an extra layer of data integrity and protection, as you can build in logic to ensure that what you have received is what you should have received.  If the row counts don't match, you can short-circuit your load process rather than load incomplete or partial result sets.

Ever try to back out bad or partial data from a big data warehouse?  It isn't fun.  Use a control file to prevent the partial or bad data from entering your tables in the first place.

The process I use, at a general level, is as follows:

1. I create a DTS package that uses an ActiveX script to check for the arrival of a "new" control file.  If the file isn't there, the ActiveX script returns an error.  This package is scheduled in a SQL Server Agent job step, which will "on failure", keep re-trying until the control file arrives. I have the retry interval set to every minute - because the file arrival times vary - and I wan't to process files as soon as possible.

2. Once the file arrives, the SQL Server Agent job kicks off the consecutive packages. These packages load the individual files into the staging database (including the control file).

3. After the data is loaded, I then use a  stored procedure to check the row counts of the SQL Server staging tables versus the control file.  If there are issues, I raise an error and none of the data is processed.  If no issues, however,  I continue on with loading the dimension and fact table data.