Skip to main content

Cloud

Exporting Data from SharePoint 2007 Lists to SQL Server via SSIS

Like many organizations, we have many different repositories for data within our organization. Some of our data sits in SQL Server for applications like Microsoft Dynamics SL and Microsoft CRM. Most of our other data sits in SharePoint lists which we use regularly for both internal and external collaboration.

From time to time, we need to mash some of this data for various analysis purposes. Many times, we use tools like the BDC to import data into SharePoint but we recently had a case where we needed to get data out of SharePoint into native SQL tables. We could have done a one time export to Excel and/or Access but we needed an approach that would allow us to do scheduled exports of the data without manual intervention.

For this task, we chose what seemed like the most appropriate Microsoft tool – SQL Server Integration Services. I started down the path of using a Web Service Task only to find that the SSIS Web Service Task won’t work with SharePoint lists so it was time to look for a ‘creative’ solution. The approach ended up being pretty simple after figuring out how to get around the SSIS / SharePoint Web Service limitation.

Extract the data using the SharePoint Web Service

First, you need to create a proxy using Visual Studio. You can find a good reference for this task here.

This proxy needs to be saved to two locations:

  • The GAC: C:Windowsassembly
  • The Microsoft .NET 2.0 folder: C:WindowsMicrosoft.NETFrameworkv2.0.50727

Now, go ahead and create your SSIS project and add a Script Task. Open up the Script Task and click Design Script. The script will need the following references:

  • System.XML
  • System.Web.Services
  • SharePointProxy (your custom class)

From there, you can go ahead and create your Script. Mine looks like the following:

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

Public Class ScriptMain

Public Sub Main()
Dim SPProxy As New SharePointProxy.Lists()
SPProxy.Credentials = System.Net.CredentialCache.DefaultCredentials
SPProxy.Url =
http://URL/PATH/_vti_bin/lists.asmx
Dim Node As Xml.XmlNode = SPProxy.GetListItems("List", Nothing, Nothing, Nothing, Nothing, Nothing, Nothing)
Dim writer As IO.StreamWriter = New IO.StreamWriter("c:ssis_execsharepoint_list.xml", False)
writer.Write(Node.OuterXml)
writer.Close()
Dts.TaskResult = Dts.Results.Success
End Sub

End Class

We use SPProxy.Credentials to use the credentials from the current user. You may want to specify a service account or different set of credentials depending on your environment and security configuration. In addition, make sure that the account that you’re running under has rights to write to the file system location that you specify.

Transform the data

Now that we have our sharepoint_list.xml file, we need to make it usable. Create an XML Task and configure it for XSLT. In my case, I supply sharepoint_list-convert.xml as the destination file for the operation and set OverwriteDestination to True so that I can run this package again in the future without needing to delete the prior files.

image

The XML from the original sharepoint_list file won’t work all that well since it has multiple namespaces so we need to address this. I found a great blog written by Ashvini Sharma that shows how to remove the 2nd namespace from the XML.

Go ahead and add the following as the SecondOperand:

<?xml version="1.0" encoding="utf-8" ?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="no" />

<xsl:template match="/|comment()|processing-instruction()">

<xsl:copy>

<xsl:apply-templates />

</xsl:copy>

</xsl:template>

<xsl:template match="*">

<xsl:element name="{local-name()}">

<xsl:apply-templates select="@*|node()" />

</xsl:element>

</xsl:template>

<xsl:template match="@*">

<xsl:attribute name="{local-name()}">

<xsl:value-of select="." />

</xsl:attribute>

</xsl:template>

</xsl:stylesheet>

Now that we have some clean XML, let’s take finish the SSIS project and import it into SQL Server table.

Load the data

Since I’m going to re-run this as a scheduled task, I want to make sure that my destination table is cleared up so I add a SQL Task with the following script:

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[sharepoint_table]’) AND type in (N’U’))
TRUNCATE TABLE [dbo].[sharepoint_table]

Now we’re ready so add a Data Flow Task to the SSIS designer. Open up the task and add an XML Source and OLE DB Destination.

image

Within the XML Source, specify that you want the Output name ‘row’ (rather than data). Then, go ahead and specify which columns you want to include and given them a friendly name by removing the ‘ows_’ and ‘x0020’ from the column names. Go ahead and connect those columns to your destination columns in your SQL table and execute the package.

In my case, I have about 900 items (rows) of data in SharePoint and the package runs from start to finish in about 14 seconds.

image

I hope that you find this information helpful. Special thanks to Michael Becker and Jeff Monnette for their contributions to this project.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Ethan Matyas

More from this Author

Follow Us