The age of Big Data is upon us, and the age of Pervasive (or “Ambient”) Data is rapidly approaching.
This means not only the typical loosely structured Big Data sources that we’ve become accustomed to considering, like Twitter feeds, Facebook posts, and web logs, but also the streams of data generated by the growing Internet of Things: a universe of sensors deployed in everything from refrigerators to factory robots, serving purposes from managing self-driving cars to automatically adjusting the air conditioning in our homes.
More and more, the data streams generated by the various elements of our wired world are being pulled together to improve performance, to save money, to enhance profit, and to improve our lives in general. And more and more, as the effects of leveraging all forms of Big Data are felt across a wide spectrum of industries, there is a stampede of organizations that want to leverage it for all those purposes.
But amidst that stampede are thousands of existing solutions, systems, and products already in place and being used for analytics and data warehousing. And most of those systems, regardless of platform or vendor, are not equipped to deal with the Volume, Velocity, and Variety of the Big Data we are talking about.
Generally, the kind of processing demanded by Big Data takes a Hadoop installation, which may not be practical to implement or maintain, whether as an add-on to these systems or as a new build from scratch.
But an ideal solution to this problem is Azure HDInsight, a 100% Apache Hadoop solution based entirely in the Cloud. With the scalability, security, and adaptability of the Azure environment, HDInsight can allow you to bring Big Data into an existing solution in a matter of days. The Azure environment is both easy to manage and cleanly integrated with the Microsoft server platform via PowerShell scripting. Further, HDInsight integrates natively with the rest of the Microsoft BI stack (including Office 365 Power BI plus SQL Server 2012 and 2014) using Hive and ODBC.
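To make the Hive-and-ODBC connection a little more concrete, here is a minimal Python sketch of how a client might query HDInsight through an ODBC data source. It is an illustration under stated assumptions, not an official recipe: the DSN name `HDInsightHive`, the `web_logs` table, and the `status_code` column are hypothetical, and the sketch assumes the third-party `pyodbc` package plus a Hive ODBC data source already configured on the machine.

```python
def build_top_n_query(table: str, column: str, limit: int = 10) -> str:
    """Return a HiveQL query that counts rows per distinct value of `column`."""
    # Identifiers are interpolated directly, so accept only simple names.
    for name in (table, column):
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"SELECT {column}, COUNT(*) AS hits "
        f"FROM {table} GROUP BY {column} "
        f"ORDER BY hits DESC LIMIT {int(limit)}"
    )


def fetch_rows(dsn: str, query: str):
    """Run `query` through an ODBC DSN and return all result rows."""
    import pyodbc  # third-party; imported lazily so the query builder works without it

    # Hive has no transactions, so the connection is opened in autocommit mode.
    conn = pyodbc.connect(f"DSN={dsn}", autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    sql = build_top_n_query("web_logs", "status_code", limit=5)
    print(sql)
    # rows = fetch_rows("HDInsightHive", sql)  # needs a configured Hive ODBC DSN
```

The same DSN could, in principle, be reused by any ODBC-aware tool, which is the practical point of exposing Hive tables this way.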
On the other end of the spectrum, for massive quantities of traditional relational data AND Big Data capability in a single solution, the Microsoft Analytics Platform System (APS) is a turnkey answer. The APS consists of a SQL Server Parallel Data Warehouse (PDW) appliance, which optionally includes a private instance of HDInsight, as well as the breakthrough PolyBase data querying technology, which allows SQL queries to combine relational and Hadoop data. The APS can:
- Scale out to 6 PB (yes, that’s petabytes) of data
- Provide up to 100x performance gains over traditional data warehouse architectures
- Store and integrate traditional relational AND Big Data in the same appliance
As a bonus, it’s also the cheapest option (per terabyte) of any data warehouse appliance available. 🙂
So as the drive to Big Data intensifies, Microsoft’s Cloud OS and other Data Platform offerings are positioned to help organizations of all sizes take advantage.