

The Modern Data Warehouse Will Augment Hadoop

The data warehouse has been part of the EIM vernacular for nearly 20 years. Its two guiding objectives, a single source of the truth and a single repository for reporting and analysis, have resulted in a never-ending journey. The data warehouse has never had enough data, and the quality required for a single version of the truth demands significant investment that only rare business cases could support. Further, the role of analytical database has generally been difficult for the data warehouse to fill: ad hoc analysis on large sets of complex data has been a significant challenge for the traditional data warehouse. Historically, to address this, companies have implemented appliances, analytical data marts, or a varying set of database features and compromises (think bitmap indexing, a variety of hardware and software caching techniques, and indexed stored data, to name a few), all with significant investment and usually significant added overhead.

However, there is hope: Big Data, aka Hadoop on commodity-class hardware. This low(er)-cost and highly scalable environment can help solve a number of challenges. First, almost 100 times more data can be stored for a similar hardware investment. Next, analytical processing can be offloaded from the data warehouse. The massively scalable hardware truly enables ad hoc, big data analytics, a role that traditional data warehouses have struggled to fill.

Lastly, if one moves the storage of atomic-level data and the analysis overhead out of the data warehouse, do we still need a data warehouse? We still need something to handle the complexities around reporting, structured analysis (e.g. OLAP), and traditional BI. Clearly, the data warehouse as we know it will change significantly. With the traditional data warehouse roles of storing atomic data and serving up analytics transitioning to Hadoop environments, do data warehouses become optimized data and reporting marts? Time will tell, but one thing is clear: Hadoop does not augment data warehouses; data warehouses will now augment Hadoop.


Bill Busch

Bill is a Director and Senior Data Strategist leading Perficient's Big Data Team. Over his 27 years of professional experience, he has helped organizations transform their data management, analytics, and governance tools and practices. As a veteran in analytics, Big Data, data architecture, and information governance, he advises executives and enterprise architects on the latest pragmatic information management strategies. He is keenly aware of how to advise and lead companies through developing data strategies, formulating actionable roadmaps, and delivering high-impact solutions. As one of Perficient’s prime thought leaders for Big Data, he provides the visionary direction for Perficient’s Big Data capability development and has led many of our clients' largest Data and Cloud transformation programs. Bill is an active blogger and can be followed on Twitter @bigdata73.
