I remember several years ago when I was working with a company’s CIO. His comment to me was, “Why do I need to move any data or create a data warehouse? Why can’t I just virtualize all of my operational data from its source and use it where it is? That way, I get real-time results and a lot of flexibility when our business requirements change.”
This may be a typical statement or question from a business executive, and I’m sure most of you have heard something like it before. Why can’t we just virtualize all of our operational data through integration rules and business logic? We may be able to, but it would not always be the right answer. The right answer depends on the use case.
Forrester’s view of data virtualization (also known as Information-as-a-Service) is: “Data virtualization has many use cases including providing a single version of the truth; enabling real-time business intelligence (BI), enterprise wide search, or high-performance scalable transaction processing; exposing big-data analytics; federating views across multiple domains; improving security and access; integrating with cloud and partner data and social media; as well as delivering information to mobile apps.”
Data virtualization makes data look as if it is physically there when, in reality, it is just code in the form of integration rules, data quality rules, formatting rules, match/merge rules, and so on.
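To make the idea of "data that is really just rules" concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not how any particular product works: a plain SQL view stands in for the virtualization layer, and the table names (`crm_customers`, `billing_customers`), the view name (`v_customers`), and the rules themselves are all made up for illustration.

```python
import sqlite3

# Two in-memory tables stand in for two operational source systems.
# All table, column, and view names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crm_customers (id INTEGER, name TEXT, country TEXT);
    CREATE TABLE billing_customers (cust_id INTEGER, full_name TEXT, country_code TEXT);
    INSERT INTO crm_customers VALUES (1, '  alice smith ', 'us');
    INSERT INTO billing_customers VALUES (2, 'BOB JONES', 'DE');

    -- The "virtual" object: no rows are copied, only rules are stored.
    CREATE VIEW v_customers AS
        SELECT id AS customer_id,
               UPPER(TRIM(name)) AS customer_name,  -- formatting rule
               UPPER(country) AS country            -- standardization rule
        FROM crm_customers
        UNION ALL
        SELECT cust_id, UPPER(TRIM(full_name)), UPPER(country_code)
        FROM billing_customers;
""")


def read_customers(connection):
    """Query the view; the rules run against the sources at read time."""
    return connection.execute(
        "SELECT customer_id, customer_name, country "
        "FROM v_customers ORDER BY customer_id"
    ).fetchall()


before = read_customers(conn)

# A change in a source system is visible immediately, with no reload step.
conn.execute("INSERT INTO crm_customers VALUES (3, 'carol diaz', 'fr')")
after = read_customers(conn)

print(before)
print(after)
```

The consumer only ever sees `v_customers`. The cleaned, combined rows exist only while a query runs, which is why the result is always as fresh as the sources.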
Here are some examples where virtualizing data may be warranted:
- Business needs to instantly combine & access fresh & accurate data
- Complete view of the business is required on-demand
- Business needs to be agile to innovate
- High-value operational reporting is critical
- Complex and heterogeneous IT environment
- Data Mart proliferation
According to Forrester’s “Wave of Data Virtualization, Q1 2012,” Informatica is the leading provider of data virtualization solutions. Informatica’s solution is called Data Services, and it was built around the same concepts as its data integration solution (PowerCenter) and its data quality solution.
What differentiates their solution from some of the others is that data integration rules and data quality rules can be built in their Data Services (virtualization) solution and then used in their data integration solution (PowerCenter).
This means a company can use data virtualization to develop and deploy a virtual data integration solution rapidly. If or when that solution requires a physical data structure (e.g. a data warehouse), it can be instantiated without rewriting code: the rules that were developed with data virtualization are reused to populate the physical data structure.
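The "build once, run virtually or physically" idea can be sketched in a few lines. This is only an illustration using Python's built-in `sqlite3` module, not Informatica's mechanism: a single SQL rule definition is used first for a view and then to populate a physical table. The names `src_orders`, `v_orders`, `dw_orders`, and `ORDER_RULES` are hypothetical.

```python
import sqlite3

# One rule definition, written once. All names are hypothetical.
ORDER_RULES = """
    SELECT order_id,
           ROUND(amount_cents / 100.0, 2) AS amount,
           UPPER(TRIM(status)) AS status
    FROM src_orders
    WHERE amount_cents IS NOT NULL
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE src_orders (order_id INTEGER, amount_cents INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO src_orders VALUES (?, ?, ?)",
    [(1, 1999, " shipped "), (2, None, "open"), (3, 500, "open")],
)

# Phase 1: expose the rules virtually; nothing is copied.
conn.execute(f"CREATE VIEW v_orders AS {ORDER_RULES}")

# Phase 2: the same rules populate a physical table, with no rewrite.
conn.execute(f"CREATE TABLE dw_orders AS {ORDER_RULES}")

virtual_rows = conn.execute("SELECT * FROM v_orders ORDER BY order_id").fetchall()
physical_rows = conn.execute("SELECT * FROM dw_orders ORDER BY order_id").fetchall()

# The trade-off: after a source change, the view is current and the
# physical copy is stale until it is reloaded.
conn.execute("INSERT INTO src_orders VALUES (4, 250, 'open')")
virtual_count = conn.execute("SELECT COUNT(*) FROM v_orders").fetchone()[0]
physical_count = conn.execute("SELECT COUNT(*) FROM dw_orders").fetchone()[0]

print(virtual_rows == physical_rows, virtual_count, physical_count)
```

Because both objects are generated from the same definition, they return identical rows at load time. The choice between them becomes a question of freshness versus query cost, not of maintaining two sets of logic.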
Informatica Data Services provides a single environment for data integration and data federation along with role-based tools that share common metadata. It allows analysts to access and merge data directly across systems and to collaborate with IT to create sophisticated business rules that leverage the data profiling, complex transformations, data quality, and data masking capabilities of the Informatica platform. With Informatica Data Services, a company benefits from a single scalable architecture for both data integration and data federation, creating a data virtualization layer that hides and handles the complexity of accessing underlying data sources—all while insulating them from change. As a result, analysts get the data they need and trust while IT retains control of the process. IT can deploy data services that can be instantly reused for all applications without rework.
Informatica Data Services offers on-the-fly data quality and profiling, a model-driven approach to provisioning data services, performance enhancements, cloud integration, common metadata, and role-specific tools.
Forrester Wave of Data Virtualization
Hi, I have a use case: I need to create an output virtual table using Informatica data virtualization. There will be two inputs from different sources: input1 will be a SQL table and input2 will be a Hadoop Hive table. I want a third table, an output virtual table, that holds the union of these two input tables. This virtual table will then be consumed by a reporting tool to create an SSRS report.
I am new to Informatica. I searched a lot for a way to perform this data virtualization, but did not find any suitable guide.
Please help me to achieve this.