There are certain aspects of data warehouse (DW) architecture, or systems design, for which the DW decision maker is solely responsible. These are decisions the decision maker will have to make alone, as opposed to decisions that are dictated by system conditions or made by the DW group. It is valuable to isolate these decisions because most data warehousing literature lumps too many different types of decisions together while often omitting others. The standard literature also usually lists only three reporting and staging data store architectures (the enterprise warehouse, dependent data marts, and independent data marts), when in reality many more exist.
The first aspect of architecture we’ll explore is the most critical: data consistency. This involves choosing the data sources, dimensions, business rules, semantics, and metrics that the organization will make available to users, and those it will not. What needs to be considered here is the place of the data warehouse in the business. This decision is often affected by internal politics, which makes it that much harder. It is also fundamental to all other architectural decisions, and it should therefore be given the highest priority, because it will have the most impact on the DW’s return on investment.
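One way to picture data consistency is a single shared catalog of metric definitions that every report draws on, so that a figure means the same thing wherever it appears. The following is a minimal sketch under stated assumptions: the `Metric` class, the `CATALOG` dictionary, the `net_revenue` rule, and the sample rows are all hypothetical and are not taken from any particular DW product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass(frozen=True)
class Metric:
    """One agreed business definition of a metric (hypothetical structure)."""
    name: str
    description: str
    compute: Callable[[Row], float]


# Hypothetical shared catalog: exactly one definition per metric name.
CATALOG: Dict[str, Metric] = {
    "net_revenue": Metric(
        name="net_revenue",
        description="Gross sales minus returns and discounts",
        compute=lambda r: r["gross"] - r["returns"] - r["discounts"],
    ),
}


def total(metric_name: str, rows: List[Row]) -> float:
    """Sum a metric over rows using the catalog's definition, not a local one."""
    metric = CATALOG[metric_name]
    return sum(metric.compute(r) for r in rows)


# Illustrative rows, as they might arrive from a source system.
sample_rows: List[Row] = [
    {"gross": 100.0, "returns": 10.0, "discounts": 5.0},
    {"gross": 50.0, "returns": 0.0, "discounts": 0.0},
]

net_revenue_total = total("net_revenue", sample_rows)
```

Because every report calls `total` with a catalog name rather than writing its own formula, a change to the business rule is made in one place, which is the practical effect a data consistency decision is meant to have.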
The main purposes of a data warehouse are storing data and reporting against that data. It can also serve as a staging area where data is cleansed before possibly being moved to another location. So, it must be determined where the data that is reported against will be held; this is the reporting and staging data store architecture, and some of the options were mentioned above. What should be guarded against is the tendency to overlook what is practical for a particular environment and to focus instead on some concept of architectural “purity and beauty”.
Data modeling architecture choices depend on whether denormalized, normalized, object-oriented, proprietary multidimensional, or other data models are used. There is nothing wrong with using a variety of models if they fit the organization’s needs.
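To make the difference between normalized and denormalized models concrete, here is a small sketch using plain Python dictionaries in place of tables. The table names, columns, and values are invented for illustration; a real warehouse would express the same idea in its own schema and query language.

```python
from typing import Dict, List

# Normalized form: customer attributes are stored once and referenced by key.
customers: Dict[int, Dict[str, str]] = {
    1: {"name": "Acme", "region": "West"},
    2: {"name": "Globex", "region": "East"},
}
orders: List[Dict[str, object]] = [
    {"order_id": 10, "customer_id": 1, "amount": 250.0},
    {"order_id": 11, "customer_id": 2, "amount": 100.0},
    {"order_id": 12, "customer_id": 1, "amount": 50.0},
]


def denormalize(order_rows, customer_table):
    """Copy customer attributes onto every order row (a flat reporting table)."""
    return [{**o, **customer_table[o["customer_id"]]} for o in order_rows]


def sales_by_region(flat_rows):
    """Aggregate the flat table without needing any further lookups."""
    totals: Dict[str, float] = {}
    for row in flat_rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


flat = denormalize(orders, customers)
region_totals = sales_by_region(flat)
```

The normalized form avoids repeating customer attributes, while the flat form trades that repetition for simpler reporting queries. Keeping both, each where it fits, is an example of using a variety of models.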
Tool architecture is the choice of the tools to use for reporting and infrastructure.
Processing tiers architecture involves choosing which physical platforms will handle which pieces of the concurrent processing that takes place when a data warehouse is in use. For some, this will be a simple host-based reporting system; for others, it may be far more complex.
Restricting access to the DW, possibly even down to the row or field level, will usually require a security architecture beyond what may be in the existing systems. This is another area that can be fraught with internal politics.
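As a rough illustration of what row-level and field-level restriction means, the sketch below filters a result set according to a per-user policy. The `Policy` class, the `region` column used for row filtering, and the sample data are all assumptions made for this example; production systems normally enforce such rules inside the database or reporting tool rather than in application code.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Policy:
    """Hypothetical access policy for one user or role."""
    allowed_regions: FrozenSet[str]  # row-level rule: which rows are visible
    visible_fields: FrozenSet[str]   # field-level rule: which columns are visible


def apply_policy(rows: List[Dict[str, object]], policy: Policy) -> List[Dict[str, object]]:
    """Drop rows outside the allowed regions, then drop hidden fields."""
    return [
        {k: v for k, v in row.items() if k in policy.visible_fields}
        for row in rows
        if row["region"] in policy.allowed_regions
    ]


rows = [
    {"customer": "Acme", "region": "West", "amount": 250.0, "credit_limit": 1000.0},
    {"customer": "Globex", "region": "East", "amount": 100.0, "credit_limit": 500.0},
]

# An analyst who may see only West rows and may not see credit limits.
west_analyst = Policy(
    allowed_regions=frozenset({"West"}),
    visible_fields=frozenset({"customer", "region", "amount"}),
)

visible = apply_policy(rows, west_analyst)
```

Even in this toy form, the policy has to be defined per user or role and kept in step with the data, which hints at why the existing systems' security is rarely sufficient on its own.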
A final note on architecture is that business practices inevitably have to change in conjunction with a DW system implementation. That is another reason why the determination of the data consistency architecture is so important.
(The primary source for the above was the “Data Warehousing Information Center”.)