Microsoft Cloud BI: Azure Data Services for BI?

In my first post in this series, I talked a little about the basics of cloud BI and Microsoft’s Windows Azure public cloud platform.  I gave a brief glimpse of what services Azure offers for BI.   So to begin with, let’s recap the Azure data services offerings in terms of the cloud computing models available.  Azure’s Platform as a Service (PaaS) offerings include Azure SQL Database, Azure SQL Reporting Services, HDInsight, and Azure Tables.  Azure’s Infrastructure as a Service (IaaS) offering is Azure Virtual Machines.  
 So what does each of these services provide, and is it something you can leverage for BI? 
 Azure SQL Database
The Service Formerly Known As “SQL Azure” is essentially “SQL Server in the cloud”.   With your subscription, you can establish up to 150 databases of either 5GB each for the Web Edition subscription (up to 150GB each for the Business Edition).  Aside from some administration differences and some datatypes in the on-premise software not supported in the cloud service, Azure SQL Database functions very much like SQL Server.  You access data with T-SQL and write queries the same way.  It’s a great option for building an application data source — especially if that application is also being built in Azure.  But what about for BI use?  The capacity and capabilities of SQL Database are sufficient for almost any small company’s data warehouse needs — even that of many mid-size organizations.  But this is not the place to store a multi-TB data warehouse.   One could theoretically Federate  databases to achieve that, but complexity and storage costs would be prohibitive.  Connectivity and bandwidth concerns may also render typical ETL patterns impractical.  And, as if that weren’t enough, Azure SQL Database does not include the BI stack tools: Analysis Services, Integration Services, or Reporting Services. 
 Azure SQL Reporting Services
This service is, as you might guess, basically a port of SQL Server Reporting Services into the Azure world as a PaaS offering.  As such, the development experience and functionality are very much like that of the document-oriented interface of traditional RS.  So this makes up for not having SSRS in SQL Database right?    Well, no.  Azure RS is really built for use only against a SQL Database data source.  And as far as security, SQL Authentication is the only scheme supported.  So, no, this isn’t a cloud-based reporting service at-large, and isn’t a general purpose BI tool.
 Azure Tables
Azure Table storage provides storage primarily aimed at non-relational data.  This “No SQL” option is ideal for storing big (up to 100TB with an Azure Storage account) datasets that don’t require complex joins or referential integrity.  Data is accessible in tables using oData connections and LINQ in .NET applications.  The term “Tables” is a little deceiving, because an Azure Table is actually a collection of Entities, each of which has Properties.  You can think of this as analogous to a Row of Columns — even though it’s quite different under the covers.  So, this type of storage is of great use in app development for storing, say User or Profile  information.  It’s less expensive than Azure, scales easily and transparently for both performance and data size.  BUT it does not make for a great BI platform , as things like RI and joins tend to be fairly critical parts of a Data Warehouse. 
A cloud-based offering for storing Big Data in the cloud, HDInsight is built on Hadoop, but offers the capability of end users to tap into big data using SQL Server and/or Excel.   HDInsight uses Azure Storage , which provides the ability to contain TB-level stores of unstructured data.  But the real magic happens in supplementary tools like Hive, Pig, and Sqoop,  which allow users to submit queries and return results from Hadoop using SQL.  One could connect to an HDInsight store using on-premise SQL Server, Azure SQL Database, or even Excel.   So here, we have something extremely compelling from the perspective of analytics. 
 BTW – This is one way in the Microsoft stack to integrate Big Data into existing solutions.  The other way, staying native to Microsoft tech, would be using Parallel Data Warehouse (PDW) 2012, which features cool new Polybase technology as a way to bridge the gap between Hadoop and SQL.  But that’s an entirely different topic…
 Azure Virtual Machines
So, we’ve evaluated the PaaS side and found it a questionable fit for BI.  So what about the IaaS side and Azure Virtual Machines?  NOW we’re talking!  Azure VM’s provide the only path to running full-featured SQL Server BI in the cloud, since you can install and run a complete version of SQL server.
But since this post is already too long,  so I’ll save that for next time….  See you then!

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Andrew Tegethoff

Andy leads Perficient's Microsoft BI team. He has 16 years of IT and software experience with a primary focus on Enterprise Information Management solutions using the Microsoft Data Platform.

More from this Author

Subscribe to the Weekly Blog Digest:

Sign Up
Follow Us