OLAP and Hadoop are not the same thing. OLAP (online analytical processing) is a technology for multi-dimensional analytics such as reporting and data mining; products of this kind date back to around 1970. Hadoop is a technology for massive computation over large data sets; its roots go back to around 2002. The two can be used together, but there are differences to weigh when choosing between Hadoop/MapReduce data processing and classic OLAP. For this chat, let’s set aside price and assume the business needs have already been thought through.
1 Processing Type
For reporting and data mining, use OLAP. For exploratory analytics and data discovery, use Hadoop. For known, cleaned data and processes that yield definitive, high-integrity results, use OLAP. For unknown, messier data and processes that yield suggestive results, use Hadoop. For example, use OLAP for weather sensor readings, but Hadoop for weather models. OLAP performs fast reads on high-end servers. Hadoop spreads both reads and writes across many distributed servers.
2 Data Size
OLAP is meant to operate on pre-aggregated data built from a massive number of records. It gives good throughput over many records in a data warehouse. Hadoop is meant to operate on massive un-aggregated data spread over a smaller number of objects. It gives high throughput over larger objects in a data lake (Harris, n.d.). Does the business need many smaller objects or fewer larger objects? For example, if summing records is important, then OLAP is a good fit. But if audio analysis is important, then Hadoop is a good fit. Overall, Hadoop has the superior throughput.
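To make "pre-aggregated" concrete, here is a minimal sketch of the idea behind an OLAP roll-up: sums are computed once for every combination of dimensions, so later queries become lookups instead of scans. The records, the dimension names, and the `build_cube` helper are all made up for illustration; a real OLAP engine does far more than this.

```python
from collections import defaultdict

# Hypothetical fact records: (region, year, sales_amount)
records = [
    ("north", 2016, 100),
    ("north", 2017, 150),
    ("south", 2016, 80),
    ("south", 2017, 120),
    ("north", 2017, 50),
]


def build_cube(rows):
    """Pre-aggregate sums for every combination of the two dimensions.

    "*" stands for "all values" of a dimension, as in a roll-up.
    """
    cube = defaultdict(int)
    for region, year, amount in rows:
        for r in (region, "*"):
            for y in (year, "*"):
                cube[(r, y)] += amount
    return dict(cube)


cube = build_cube(records)

# Queries are now dictionary lookups rather than scans over the raw records.
north_2017 = cube[("north", 2017)]   # one cell of the cube
all_2016 = cube[("*", 2016)]         # rolled up over region
grand_total = cube[("*", "*")]       # rolled up over both dimensions
```

The trade-off is visible even at this size: the cube answers the summing questions instantly, but it only holds the sums that were planned for in advance.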
3 Interaction
OLAP is typically queried with SQL, which is based on a relational DB model. Hadoop is typically queried with HQL (HiveQL, the SQL-like query language of Apache Hive). Jeyakanth (2017) describes HQL as combining object-oriented programming with relational DB concepts. SQL is good for select, insert, update, and delete on rows. Hadoop is good for any other manner of object.
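As a small illustration of the SQL side, the sketch below runs the four row-level statements (insert, update, delete, select) against an in-memory SQLite database using Python's standard `sqlite3` module. SQLite is not an OLAP engine, and the `tickets` table and its columns are hypothetical; the point is only to show the shape of the interaction.

```python
import sqlite3

# In-memory database, used here only as a stand-in for a relational store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id INTEGER PRIMARY KEY, officer TEXT, fine REAL)"
)

# INSERT: three hypothetical rows (ids 1, 2, 3 are assigned automatically).
conn.executemany(
    "INSERT INTO tickets (officer, fine) VALUES (?, ?)",
    [("adams", 50.0), ("baker", 75.0), ("adams", 120.0)],
)

# UPDATE and DELETE act on individual rows selected by a predicate.
conn.execute("UPDATE tickets SET fine = 100.0 WHERE id = 2")
conn.execute("DELETE FROM tickets WHERE id = 1")

# SELECT with GROUP BY: the reporting-style aggregate query.
totals = conn.execute(
    "SELECT officer, SUM(fine) FROM tickets GROUP BY officer ORDER BY officer"
).fetchall()
conn.close()
```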
4 Data Structure
OLAP is meant for a structured dimensional model. It scales well vertically. OLAP likes more of the same thing, stored as rows in a relational table. Hadoop, by contrast, is meant for unstructured data and scales well horizontally. Hadoop likes more of different things, handled as key/value pairs. Thus, the source of the data is an important consideration. For example, use OLAP for more police ticket transactions and Hadoop for more body cam data. Overall, Hadoop will be better when the driver is maximum total storage.
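The key/value style can be sketched in a few lines. This is a single-process imitation of the map, shuffle, and reduce steps, not Hadoop's actual API; the input strings and the function names `map_phase`, `shuffle`, and `reduce_phase` are assumptions made for the example. On a real cluster each input split would be mapped on a different machine, which is where the horizontal scaling comes from.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical unstructured inputs, one string per "file split".
splits = [
    "camera on camera off",
    "camera on",
]


def map_phase(text):
    """Emit a (key, value) pair for every word in one split."""
    return [(word, 1) for word in text.split()]


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values for one key into a single result."""
    return key, sum(values)


mapped = chain.from_iterable(map_phase(s) for s in splits)
counts = dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())
```

Nothing here required a schema up front: the only structure is the key/value pair, which is why this style copes with messy or varied inputs.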
Conclusion, OLAP and Hadoop
In most cases, Hadoop can do what OLAP does. OLAP might still be needed if there is a legacy system to consider, if you only need reporting, or if technology maturity is a driver. Generally, however, I lean toward Hadoop/MapReduce over OLAP.
For more information: