I just happened to hear about the NoSQL Now Conference taking place in San Jose through this weekend and thought it would be interesting to explore a bit about bridging NoSQL and BI.
NoSQL (‘Not Only SQL’) refers to a newer generation of databases that differ from traditional ones in being non-relational, distributed, usually open-source and horizontally scalable. They also typically don’t require fixed table schemas.
Some of the popular NoSQL systems include:
Hadoop (strictly a distributed storage and processing framework; HBase is the database built on it)
Cassandra (Facebook, Twitter etc)
BigTable (Google)
Dynamo (Amazon)
Using NoSQL in BI Applications
There is no doubt that in order to have that competitive edge, companies need to make the best use of BI & analytics.
First let’s take a look at some of the problems with traditional databases and how NoSQL can make a difference:
- Ability to handle BIG DATA – With the increasing volumes of data that companies use these days, the big challenge is storing & retrieving it efficiently. You surely don’t want to leave vital data out of your analysis. NoSQL databases are designed to handle massive volumes of data, often beyond what even the largest RDBMS deployments manage comfortably.
- Scalability – Whilst vertical scaling is straightforward in an RDBMS, horizontal scaling is much harder to achieve. NoSQL databases are built for horizontal scalability, that is, the ability to distribute the load across multiple servers.
- Reduced administration – NoSQL databases are designed in such a way that the administration & maintenance effort required is minimal.
- Flexibility – How often do we come across situations where adding an extra column or making some other minor change to a table causes downtime, due to the restricted nature of modeling in an RDBMS? In contrast, NoSQL has few, if any, modeling restrictions.
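To make the scalability and flexibility points a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `ShardedStore` is a hypothetical toy class, the "servers" are plain in-memory dictionaries, and the simple hash-modulo placement stands in for the more elaborate partitioning schemes (such as consistent hashing) that real NoSQL systems use.

```python
import hashlib


class ShardedStore:
    """Toy key-value store spread over several 'servers' (plain dicts here)."""

    def __init__(self, shard_count):
        # Each dict stands in for one server in the cluster (an assumption).
        self.shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, key):
        # Hash the key and pick a shard, so load is spread across servers.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, document):
        self.shards[self._shard_for(key)][key] = document

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)


store = ShardedStore(shard_count=3)

# No fixed schema: the second record carries an extra "referrer" field,
# and nothing had to be altered or taken offline to accept it.
store.put("visit:1", {"page": "/home", "duration_s": 12})
store.put("visit:2", {"page": "/pricing", "duration_s": 40, "referrer": "search"})

print(store.get("visit:2"))
```

Adding capacity in this picture means adding more shards rather than buying a bigger machine, although with plain modulo placement most keys would move when the shard count changes, which is one reason production systems do not work exactly this way.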
When we look at the above benefits, it is easy to jump and say that companies must make the switch to NoSQL. But we need to carefully consider other parameters before coming to a decision.
- Really big data? – Companies need to evaluate how much big data they actually use in their analysis (by big we mean something like tracking the 25 million visits to your site). Unless you’re dealing with data on this scale, which is really difficult to manage with a traditional database, there probably is no pressing need.
- Maturity – NoSQL is still maturing. While its popularity is growing every day, enterprises will find it hard to leave a mature system like the RDBMS. Firms that really do make use of big data may be willing to take the risk, but other firms will remain quite circumspect.
- Connectivity to NoSQL – Not many BI tools out there provide connectivity to NoSQL databases. Companies will not want to switch away from the tool they already use until more tools offer ways to work with NoSQL.
- Expertise – Finding resources to manage / maintain the applications which use RDBMS is much easier than finding someone skilled in NoSQL.
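On the connectivity point, one common way to bridge the gap in the meantime is to flatten schema-less records into the rows and columns that BI tools expect. The Python sketch below is only an illustration: the function names `flatten` and `to_table` and the sample documents are hypothetical, and it assumes records arrive as nested dictionaries.

```python
def flatten(document, prefix=""):
    """Turn a nested dict into a flat one with dotted column names."""
    row = {}
    for key, value in document.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, e.g. visitor -> visitor.country
            row.update(flatten(value, prefix=column + "."))
        else:
            row[column] = value
    return row


def to_table(documents):
    """Build a header plus rows; fields missing from a document become None."""
    rows = [flatten(doc) for doc in documents]
    columns = sorted({column for row in rows for column in row})
    return columns, [[row.get(column) for column in columns] for row in rows]


# Hypothetical sample data: the two documents do not share the same fields.
documents = [
    {"page": "/home", "visitor": {"country": "US"}},
    {"page": "/pricing", "visitor": {"country": "IN", "returning": True}},
]

columns, table = to_table(documents)
print(columns)
for row in table:
    print(row)
```

The resulting table could then be loaded into whatever relational store or file format an existing BI tool already reads, at the cost of an extra extraction step.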
Conclusion
Along with its numerous benefits, NoSQL right now has inherent problems which need to be addressed before more firms will adopt it. That surely does not mean companies must get rid of their existing RDBMS and DBAs. NoSQL, if used in parallel, can provide that edge over your competitors. Hopefully the big BI players see this and soon give their tools the ability to connect to NoSQL databases. Of course, with students' growing interest in this area, the available skill set is bound to increase in the coming years!
This is an awesome post and very timely. Right now I think everyone’s sorta playing around and feeling things out. I really love NoSQL as a basis for transactional systems. But the BI side of it has been largely unexplored. No doubt that the BI space will be vastly different in as little as five years. Really great write up.
Thanks a lot Ron! Yeah, I had not explored much about NoSQL till last week & was quite fascinated with some of the impacts it can make. Surely there’s a huge potential for BI to integrate with NoSQL in the coming years & offer clients more insights into their data.
Another parameter to consider is what type of NoSQL solution you will implement. Because while “NoSQL solutions” can solve a lot of problems, any one particular NoSQL solution is good at solving only a few of them. And that’s a fact newcomers to the NoSQL field take a few weeks to really fully understand: many NoSQL vendors don’t share anything besides not relying on SQL-based statements.