Introduction
This is my first blog post in nearly 2 years with Perficient. I have been watching this space for a while and wondering about potential topics to blog about. I have decided to focus my first posts on some interesting Business Analytics technologies that are discussed infrequently (Data Mining, Predictive Analytics, Text Mining, Data Discovery, Change Data Capture, Disruptive & Complementary Technologies and New Trends).
New Trends in Big Data
People who follow BI blogs know that the 3 Vs (Velocity, Volume and Variety) are the cornerstones of Big Data. The 3 Vs bring in newer technologies and some rehashing of older methodologies. The following are some of the trending open source technologies that have come into the limelight:
- Stream processing
- Data exploration with high volume and low latency
- Open source statistical programming
- In-memory analytics
I will explore the technologies mentioned above, along with some smaller start-ups, in my subsequent posts.
Stream processing – With the advent of stream processing, many organizations want more real-time data for their business analytics. Many of the top ETL solutions have limitations when it comes to real-time data processing, because they are built around loading data in scheduled batches rather than handling each record as it arrives. Many enterprises see the potential of stream processing over traditional ETL for Big Data analysis.
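To make the difference from batch ETL concrete, here is a minimal sketch in plain Python of what processing a stream looks like: each event updates a running result the moment it arrives, instead of waiting for a nightly load. The `SlidingWindowAverage` class, the `event_stream` generator and the order amounts are all hypothetical names and data I made up for illustration; a real deployment would use a dedicated stream processing engine rather than a Python loop.

```python
from collections import deque


class SlidingWindowAverage:
    """Keeps a running average over the most recent `size` events."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.total = 0.0

    def update(self, value):
        # Add the new event, then evict the oldest one if the window is full.
        self.window.append(value)
        self.total += value
        if len(self.window) > self.size:
            self.total -= self.window.popleft()
        return self.total / len(self.window)


def event_stream():
    # Hypothetical order amounts standing in for a live feed.
    for amount in [10.0, 20.0, 30.0, 40.0]:
        yield amount


avg = SlidingWindowAverage(size=3)
# The metric is refreshed after every single event, not once per batch.
results = [avg.update(amount) for amount in event_stream()]
print(results)  # [10.0, 15.0, 20.0, 30.0]
```

The point of the sketch is only that the answer is always current: after the fourth event the average already reflects the last three orders, with no batch window to wait for.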
Data exploration with high volume and low latency – Hadoop has been the go-to technology for exploring huge data sets. But Google, which first introduced MapReduce, has already moved past it in-house with Dremel, which can execute queries over huge volumes of data that would ordinarily require a sequence of MapReduce jobs, at a fraction of the execution time. Drill is similar to Dremel, but it is an open source project under the Apache Software Foundation. Both have the potential to replace Hadoop/MapReduce for this kind of work in a few years.
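For readers who have not worked with MapReduce, the following toy sketch shows why "a sequence of MapReduce jobs" is a pain point. It is not Hadoop, Dremel or Drill code; `map_reduce` is a hypothetical in-memory helper and the page view records are invented. Even a simple question like "which page has the most views?" takes two chained passes, where an interactive query engine would let you ask it as one ad hoc query.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Toy single-process MapReduce: map, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return [reducer(key, values) for key, values in groups.items()]


# Hypothetical (page, country) page view records.
page_views = [
    ("home", "us"),
    ("home", "uk"),
    ("pricing", "us"),
    ("home", "us"),
    ("pricing", "de"),
]

# Job 1: count views per page.
views_per_page = map_reduce(
    page_views,
    mapper=lambda rec: [(rec[0], 1)],
    reducer=lambda page, ones: (page, sum(ones)),
)

# Job 2: send every (page, count) pair to one key and keep the largest.
top_page = map_reduce(
    views_per_page,
    mapper=lambda pair: [("top", pair)],
    reducer=lambda _key, pairs: max(pairs, key=lambda pc: pc[1]),
)

print(views_per_page)  # [('home', 3), ('pricing', 2)]
print(top_page)        # [('home', 3)]
```

On a real cluster each of those jobs writes its output to disk before the next one starts, which is where much of the latency comes from.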
Open source statistical programming – Chances are you have heard about R (a statistical analysis tool). There is a growing community of developers and data mining experts who agree that R has the potential to outdo SPSS and SAS in a few years. This open source technology is being embedded into many BI solutions. The potential for new business insights from using R alongside an existing BI implementation is huge.
I believe that in the future many businesses will take serious note of these trends and incorporate the technologies mentioned above into their core IT infrastructure.