Overview of Azure Modern Data Warehouse

Overview of Modern Data Warehouse:

 

A modern data warehouse gathers data from multiple data stores, including IoT devices, social networks, web APIs, files, and multiple corporate systems such as your CRM, HR application, and ERP.

 

A lot of value and insight can be discovered by cross-referencing this data instead of keeping it in silos. For example, how much did your sales increase after you started that new social network campaign?

 

Modern data warehouses must be able to read data in various formats. The good old days of CSV and maybe a couple of other formats are gone. Now you also have XML, Parquet, ORC, JSON, and many more. You can even use cognitive services to extract text from a recorded phone call or obtain metadata from pictures.
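
To make the multi-format point concrete, here is a minimal sketch that reads the same kind of record from CSV, JSON, and XML and coerces it into one common shape. The payloads and the field names "id" and "amount" are made up for illustration. Binary formats such as Parquet or ORC need a third-party library and are left out here.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample payloads; the field names "id" and "amount" are made up.
CSV_TEXT = "id,amount\n1,10.5\n2,7.25\n"
JSON_TEXT = '[{"id": 3, "amount": 4.0}]'
XML_TEXT = "<orders><order id='4' amount='12.75'/></orders>"


def normalize(record_id, amount):
    """Coerce a record from any source into the same typed shape."""
    return {"id": int(record_id), "amount": float(amount)}


def read_csv(text):
    reader = csv.DictReader(io.StringIO(text))
    return [normalize(row["id"], row["amount"]) for row in reader]


def read_json(text):
    return [normalize(item["id"], item["amount"]) for item in json.loads(text)]


def read_xml(text):
    root = ET.fromstring(text)
    return [normalize(node.get("id"), node.get("amount"))
            for node in root.iter("order")]


# Once normalized, records from all three sources can be combined.
records = read_csv(CSV_TEXT) + read_json(JSON_TEXT) + read_xml(XML_TEXT)
total = sum(record["amount"] for record in records)
print(len(records), total)
```

Keeping one small reader per format and a single normalize step means a new format only needs a new reader, while everything downstream keeps working on the same record shape.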

 

Data also no longer arrives only through a batch or ETL process. It could be a live stream from sensors or stock market applications, allowing you to react faster.
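
The difference from batch processing can be sketched in a few lines of plain Python: instead of waiting for a full file, results are produced as soon as each time window closes. This is only a stand-in for the idea, not how any Azure service is implemented. The sensor readings, the 10-second window, and the function name are assumptions made for the example.

```python
# Hypothetical sensor readings: (timestamp in seconds, sensor name, value).
READINGS = [
    (0, "temp", 20.0),
    (4, "temp", 22.0),
    (11, "temp", 30.0),
    (14, "temp", 32.0),
    (21, "temp", 25.0),
]


def tumbling_average(events, window_seconds):
    """Yield (window_start, average) as soon as each window closes.

    Assumes events arrive ordered by timestamp.
    """
    current_window = None
    values = []
    for timestamp, _sensor, value in events:
        window = (timestamp // window_seconds) * window_seconds
        if current_window is not None and window != current_window:
            # A reading from a later window arrived: emit the finished window.
            yield current_window, sum(values) / len(values)
            values = []
        current_window = window
        values.append(value)
    if values:
        # Emit the last, possibly partial, window when the stream ends.
        yield current_window, sum(values) / len(values)


results = list(tumbling_average(iter(READINGS), window_seconds=10))
print(results)
```

Because the function is a generator, a consumer can act on the first window's average before later readings have even arrived.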

 

Finally, modern data warehouse solutions should be able to handle big data, the term used for large quantities of data collected in escalating volumes, at higher velocities, and in a greater variety of formats than ever before.

 

Creating a modern data warehouse solution is not a trivial task, but the various Azure data services make it much easier.

 

Azure Solutions:

 

Building the solution involves several phases. First comes data ingestion, the process of capturing data, which can arrive in many different formats and from many sources. A few Azure services you could use for data ingestion are Azure Data Factory, Stream Analytics, and Event Hubs. It’s also quite common to have a staging layer that temporarily holds data arriving at high speed from various sources, so it can be batched and processed at a more convenient time. It’s also quite common, especially in ELT processes, to keep the data in the staging layer and have the analytical system read it on the fly for further analysis.
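
The staging idea can be illustrated with a small in-memory buffer that collects fast-arriving events and hands them on in batches. This is a sketch under stated assumptions: the class name StagingBuffer, the batch size, and the event shape are hypothetical, and a real staging layer would sit on durable storage rather than a Python list.

```python
class StagingBuffer:
    """Hold incoming events and release them in fixed-size batches."""

    def __init__(self, batch_size, sink):
        self.batch_size = batch_size
        self.sink = sink  # callable that receives one batch (a list of events)
        self._pending = []

    def ingest(self, event):
        """Accept one event; hand off a batch once enough have accumulated."""
        self._pending.append(event)
        if len(self._pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send whatever is pending to the sink, if anything."""
        if self._pending:
            self.sink(self._pending)
            self._pending = []


processed_batches = []
staging = StagingBuffer(batch_size=3, sink=processed_batches.append)
for event_id in range(7):
    staging.ingest({"event_id": event_id})
staging.flush()  # drain the leftover events at the end of the run

print([len(batch) for batch in processed_batches])
```

The producer only ever calls ingest, so the pace of downstream processing is decoupled from the pace at which events arrive.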

 

If you want to keep the data in its raw format instead of sending it to a final data warehouse solution, you can use Azure Data Lake. Next, you must transform and process the data and model it into a format that is more convenient for reporting. This might mean data cleansing, filtering, normalization or denormalization, format conversion, and so on. The main Azure tools for this are Azure Data Factory and Azure Databricks.
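
As a rough illustration of the transformation step, the sketch below cleanses raw rows, filters out unusable ones, converts text fields to proper types, and denormalizes customer details onto each order. All table contents, field names, and the transform function are invented for the example. In practice this logic would run in a tool such as Azure Data Factory or Azure Databricks, not in a standalone script.

```python
# Hypothetical raw rows as they might sit in a staging area or data lake.
RAW_ORDERS = [
    {"order_id": "1", "customer_id": "c1", "amount": " 19.90 "},
    {"order_id": "2", "customer_id": "c2", "amount": ""},      # missing amount
    {"order_id": "3", "customer_id": "c1", "amount": "5.10"},
    {"order_id": "4", "customer_id": "c9", "amount": "7.00"},  # unknown customer
]
CUSTOMERS = {
    "c1": {"name": "Contoso", "country": "US"},
    "c2": {"name": "Fabrikam", "country": "DE"},
}


def transform(raw_orders, customers):
    """Cleanse, filter, convert, and denormalize raw order rows."""
    rows = []
    for raw in raw_orders:
        amount_text = raw["amount"].strip()  # cleansing: trim whitespace
        if not amount_text:
            continue  # filtering: drop rows with no amount
        customer = customers.get(raw["customer_id"])
        if customer is None:
            continue  # filtering: drop rows with an unknown customer
        rows.append({
            "order_id": int(raw["order_id"]),          # format conversion
            "amount": float(amount_text),
            "customer_name": customer["name"],         # denormalization
            "customer_country": customer["country"],
        })
    return rows


report_rows = transform(RAW_ORDERS, CUSTOMERS)
print(report_rows)
```

Copying the customer name and country onto each order row trades some redundancy for simpler reporting queries, which is the usual motivation for denormalizing.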

 

Finally, it’s time to model and serve your data so that business intelligence analysts can generate reports and draw conclusions from it. The main tools for this are Azure Analysis Services and Azure Synapse Analytics. You can also add a visualization layer to create visually appealing reports for business users and executives. Power BI is the tool of choice for this.
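
To show what modeling and serving can look like at a very small scale, the sketch below builds a tiny fact table and dimension table and runs the kind of aggregate query a report would sit on. An in-memory SQLite database stands in for the serving layer purely for illustration. The table names, columns, and figures are all assumptions, not anything specific to Azure Synapse Analytics.

```python
import sqlite3

# An in-memory SQLite database stands in for the serving layer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (product_id INTEGER, quantity INTEGER, revenue REAL);
""")

# Hypothetical dimension and fact rows.
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?)",
    [(1, "Bikes"), (2, "Helmets")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 2, 1000.0), (1, 1, 500.0), (2, 4, 120.0)],
)

# The kind of query a report might issue: totals per product category.
summary = conn.execute("""
    SELECT d.category, SUM(f.quantity), SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
conn.close()

print(summary)
```

Separating descriptive attributes (the dimension) from measurable events (the facts) is what lets analysts slice the same numbers by category, region, or time without reshaping the data.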


Mangai Murugan
