Like many organizations, we have many different repositories for data within our organization. Some of our data sits in SQL Server for applications like Microsoft Dynamics SL and Microsoft CRM. Most of our other data sits in SharePoint lists which we use regularly for both internal and external collaboration.
From time to time, we need to mash some of this data for various analysis purposes. Many times, we use tools like the BDC to import data into SharePoint but we recently had a case where we needed to get data out of SharePoint into native SQL tables. We could have done a one time export to Excel and/or Access but we needed an approach that would allow us to do scheduled exports of the data without manual intervention.
For this task, we chose what seemed like the most appropriate Microsoft tool – SQL Server Integration Services. I started down the path of using a Web Service Task only to find that the SSIS Web Service Task won’t work with SharePoint lists so it was time to look for a ‘creative’ solution. The approach ended up being pretty simple after figuring out how to get around the SSIS / SharePoint Web Service limitation.
Extract the data using the SharePoint Web Service
First, you need to create a proxy using Visual Studio. You can find a good reference for this task here.
This proxy needs to be saved to two locations:
- The GAC: C:Windowsassembly
- The Microsoft .NET 2.0 folder: C:WindowsMicrosoft.NETFrameworkv2.0.50727
Now, go ahead and create your SSIS project and add a Script Task. Open up the Script Task and click Design Script. The script will need the following references:
- System.XML
- System.Web.Services
- SharePointProxy (your custom class)
From there, you can go ahead and create your Script. Mine looks like the following:
|
We use SPProxy.Credentials to use the credentials from the current user. You may want to specify a service account or different set of credentials depending on your environment and security configuration. In addition, make sure that the account that you’re running under has rights to write to the file system location that you specify.
Transform the data
Now that we have our sharepoint_list.xml file, we need to make it usable. Create an XML Task and configure it for XSLT. In my case, I supply sharepoint_list-convert.xml as the destination file for the operation and set OverwriteDestination to True so that I can run this package again in the future without needing to delete the prior files.
The XML from the original sharepoint_list file won’t work all that well since it has multiple namespaces so we need to address this. I found a great blog written by Ashvini Sharma that shows how to remove the 2nd namespace from the XML.
Go ahead and add the following as the SecondOperand:
|
Now that we have some clean XML, let’s take finish the SSIS project and import it into SQL Server table.
Load the data
Since I’m going to re-run this as a scheduled task, I want to make sure that my destination table is cleared up so I add a SQL Task with the following script:
|
Now we’re ready so add a Data Flow Task to the SSIS designer. Open up the task and add an XML Source and OLE DB Destination.
Within the XML Source, specify that you want the Output name ‘row’ (rather than data). Then, go ahead and specify which columns you want to include and given them a friendly name by removing the ‘ows_’ and ‘x0020’ from the column names. Go ahead and connect those columns to your destination columns in your SQL table and execute the package.
In my case, I have about 900 items (rows) of data in SharePoint and the package runs from start to finish in about 14 seconds.
I hope that you find this information helpful. Special thanks to Michael Becker and Jeff Monnette for their contributions to this project.