what is ETL?
ETL is a process that extracts the data from different source systems, then transforms the data (like applying calculations, concatenations, etc.), and finally loads the data into the Data Warehouse system. The full form of ETL is Extract, Transform, and Load.
What is a data warehouse (DW)?
A Data Warehouse (DW) is a relational database that is designed for query and analysis rather than transaction processing. It includes historical data derived from transaction data from single and multiple sources. The purpose of a data warehouse is to connect and analyze business data from heterogeneous sources. Data warehouses are at the core of business intelligence solutions, which analyze and report on data.
Different types of ETL Tools available?
There are multiple ETL tools available in the market such as Adeptia Connect, Alooma Platform, CData Driver Technologies, Fivetran, IBM InfoSphere Information Server, Informatica Intelligent Data Platform, Matillion ETL, SQL Server Integration Services (SSIS), Oracle Data Integration Cloud Service, Talend Open Studio, SAS Data Management, and others.
In this blog, we will be discussing Informatica PowerCenter:
What is Informatica?
Informatica is a data integration tool based on ETL architecture. Data integration across business applications, data warehousing, and business intelligence are its main applications. Informatica has bust-in functionalities to connect with various source systems like databases, file systems, or SaaS-based applications using configurations, adapters and in-built connectors. Data is extracted from all systems through Informatica, transformed on the server, and fed into data warehouses.
Example: It is possible to connect to many database servers; Oracle and Microsoft SQL Server databases are both connected here. The data can be combined with that of another system.
Why We Need Informatica?
- If we need to perform some operations on the data at the backend in a data system, then we need the Informatica.
- To modify, update, or clean up the data based on some set of rules, we need Informatica.
- By using Informatica, is accessible for the loading of bulk data from one system to another.
Components in Informatica.
Informatica consists of two types of components:
- Server component: Repository service, integration service, Domain, Node
- Client component: Designer, workflow, monitor, repository client
Informatica ETL tool has the below services/components, such as:
- Repository Service: It is responsible for maintaining Informatica metadata and provides access to the same to other services. The PowerCenter Repository Service manages connections to the PowerCenter repository from repository clients. The Repository Service is a separate, multi-threaded process that retrieves, inserts, and updates metadata in the repository database tables. The Repository Service ensures the consistency of metadata in the repository.
- Integration Service: This service helps in the movement of data from sources to targets. The PowerCenter Integration Service reads workflow information from the repository. The Integration Service connects to the repository through the Repository Service to fetch metadata from the repository. The Integration Service can combine data from different platforms and source types. For example, you can join data from a flat file and an Oracle source. The Integration Service can also load data to different platforms and target types.
- Reporting Service: This service generates the reports. After you create a Reporting Service, you can configure it. Use the Administrator tool to view or edit the Reporting Service properties. To view and update properties, select the Reporting Service in the Navigator. In the Properties view, click Edit in the properties section that you want to edit. If you update any of the properties, restart the Reporting Service for the modifications to take effect.
- Repository Manager: Repository Manager is a GUI-based administrative client component, which allows users to create new domains and used to organize the metadata stored in the Repository. The metadata in the repository is organized in folders, and the user can navigate through multiple folders and repositories as shown in the image below.
- Informatica Designer: Informatica PowerCenter Designer is a graphical user interface (GUI) for creating and managing PowerCenter objects such as source, target, Mapplets, Mapping, and transformations. To develop ETL applications, it provides a set of tools known as “Mapping”. PowerCenter Designer creates mappings by importing source tables from the database with the Source analyzer, target tables from the database with the Target designer, and transforming these tables.
- Workflow Manager: The Workflow Manager allows for the creation and completion of workflows and other tasks. You must first create tasks, such as a session containing the mapping you make in the Designer before you can establish a process. You then connect tasks with conditional links to specify the order of execution for the tasks you created.
The Workflow Manager consists of three tools to help you develop a workflow:
- Task Developer – Use the Task Developer to create tasks you want to run in the workflow.
- Workflow Designer – Use the Workflow Designer to create a workflow by connecting tasks with links. You can also create tasks in the Workflow Designer as you develop the workflow.
- Worklet Designer – Use the Worklet Designer to create a worklet. ￼
- Workflow Monitor: Informatica Workflow Monitor makes it easy to track how tasks are being completed. Generally, Informatica Power Centre helps you to track or monitor the Event Log information, list of executed Workflow, and their execution time in detail.
The Workflow Monitor consists of following windows:
- Navigator window – Displays monitored repositories, servers, and repositories objects.
- Output window – Displays messages from the Integration Service and Repository Service.
- Time window – Displays the progress of workflow runs.
- Gantt Chart view – Displays details about workflow runs in chronological format.
- Task view – Displays details about workflow runs in a report format.
So, in Part 1 we have seen an overview of Informatica PowerCentre and basic understanding of all the available tools, in the next blog we will be discussing about the various transformations available in Informatica PowerCentre.
Please share your thoughts and suggestions in the space below, and I’ll do my best to respond to all of them as time allows.
for more such blogs click here