Wednesday, August 20, 2008

Building Packages Programmatically...

...just got easier.

One of the most common SSIS questions that most consistently gets the most consistently frustrating answers is "how do I build a package that can load data from an Excel spreadsheet into a database table when I don't know the layout of the spreadsheet until run time?"

No, the question is never phrased quite this succinctly, but there are dozens of variations that all boil down to this core. The source may be a text file or a database table and not an Excel spreadsheet, but the "I don't know the schema until I run the package" or "the source columns map directly to the destination columns, but I don't know exactly what they are" aspects of the question remain the same.

And the answer remains the same too: "SSIS doesn't do 'dynamic' data flow. You can work around this limitation by using the SSIS .NET API to dynamically build and execute a package, reading in the metadata about your source and destination columns to construct the data flow."

The problem with this answer is threefold:

  1. The API for working with the SSIS data flow is a bit complex.
  2. The documentation on the SSIS data flow API is a bit sparse.
  3. The samples available that demonstrate this technique are a bit nonexistent.

This triumvirate of frustration has been a pet peeve of mine for quite some time. In fact, I've been known to say that such a frustrating answer to such a common question is a major barrier to the adoption of SSIS.

But this week the story just got better. The SSIS team has released additional functionality (see Matt Masson's blog post for more details on the new functionality, including SharePoint List Adapters!) as part of their SSIS Community Samples project on CodePlex. As part of the new release, the samples now include a package generation application that demonstrates how to solve this archetypal problem. Here's the intro blurb from the readme file for this sample:

"This sample demonstrates package generation and execution using the Integration Services object model. The sample can be used to transfer data between a pair of source and destination components at the command line.


The sample supports three types of source and destination components: SQL Server, Excel and flat file. You can choose to create a new destination based on the source component metadata. Alternatively, this sample supports mapping source and existing destination columns by using the same column names or manually, by using the command line.

You can modify the code in this sample to fit your own application."

This may not sound exciting when you read it, but it almost takes my breath away. Digging into this sample is high on my to-do list in the weeks ahead (July and August have been crazy months for me, as the lack of activity on my blog demonstrates) and if you have ever been faced with this "dynamic data flow" conundrum, you should check it out as well.

No comments: