Skip to main content

Cloud

To SSIS Or Not To SSIS

Microsoft SQL Server Integration Services has some great features and is extremely useful for cleansing and importing data into SQL Server, however; not every situation warrants its use. One instance where I found SSIS to be less useful was when importing a lot (just under 100) of XML source files into a SQL data warehouse. For this approach, it was far less time consuming to use the T-SQL Bulk Insert command for XML Data over the Data Flow task with a XML file source in SSIS.
The first thing is to create a table which will hold the imported XML file:
CREATE TABLE tblXmlImport
(

data XML NOT NULL

)
GO
Next, the command to import the XML file into the table looks like this:
DECLARE @File VARCHAR(300)
SELECT @File = ‘C:Tempbook.xml’
EXEC
(

INSERT INTO tblXmlImport(data)

SELECT xmlData

FROM

(

SELECT *

FROM OPENROWSET (BULK ”’ + @File + ”’ , SINGLE_BLOB) AS XMLDATA

) AS FileImport (XMLDATA)

)
GO
Finally, to parse the XML data and extract it to the destination table is a simple SELECT statement:
DECLARE @xml XML
SELECT @xml = xml_data
FROM XmlImportTest
SELECT

Tbl.Col.value(‘@bookkey’, ‘INT’),

Tbl.Col.value(‘@title’,‘varchar(255)’),

Tbl.Col.value(‘@subtitle’,‘varchar(255)’),

Tbl.Col.value(‘@shorttitle’,‘varchar(50)’),

Tbl.Col.value(‘@creationdate’,‘datetime’),

Tbl.Col.value(‘@sku’,‘varchar(10)’),

FROM @xml.nodes(‘//book’) Tbl(Col)
To add this method to the existing ETL process and import the files on a regular basis only requires one Stored Procedure where the dynamic setting of the @File variable can be achieved by querying the file system folder that contains the source XML files. Furthermore, the SELECT statement to parse the XML source files and insert into the destination table can be dynamically generated by querying the system tables and retrieving the column names of the destination table. Now, trying to perform the same operation with SSIS and the BIDS interface would be extremely time consuming. The dragging and dropping of the Data Flow tasks, source and destination objects, mapping of the columns and trying to manage all of those connections is an administrative nightmare. With a little bit of thought and a little more code, the same end result is achieved in far less time.
Business Intelligence Consulting

Technorati Tags: ,,,Business Intelligence Consulting,

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Brian Ringley

More from this Author

Follow Us