A Comprehensive Guide to IDMC Metadata Extraction in Table Format
https://blogs.perficient.com/2024/11/16/a-comprehensive-guide-to-idmc-metadata-extraction-in-table-format/

Metadata Extraction: IDMC vs. PowerCenter

When we talk about metadata extraction, IDMC (Intelligent Data Management Cloud) can be trickier than PowerCenter. Let’s see why.
In PowerCenter, all metadata is stored in a local database. This setup lets us use SQL queries to get data quickly and easily. It’s simple and efficient.
In contrast, IDMC relies on the IICS Cloud Repository for metadata storage. This means we have to use APIs to get the data we need. While this method works well, it can be more complicated. The data comes back in JSON format. JSON is flexible, but it can be hard to read at first glance.
To make it easier to understand, we convert the JSON data into a table format. We use a tool called jq to help with this. jq allows us to change JSON data into CSV or table formats. This makes the data clearer and easier to analyze.

In this section, we will explore jq. jq is a command-line tool that helps you work with JSON data easily. It lets you parse, filter, and change JSON in a simple and clear way. With jq, you can quickly access specific parts of a JSON file, making it easier to work with large datasets. This tool is particularly useful for developers and data analysts who need to process JSON data from APIs or other sources, as it simplifies complex data structures into manageable formats.
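As a quick illustration of how jq works, the command below reads a small JSON array from standard input and turns it into CSV rows. The sample JSON is made up purely for illustration; it simply reuses the assetName, startTime, and endTime fields referenced later in this post.

echo '[{"assetName":"tf_demo","startTime":"2024-11-10T01:00:00Z","endTime":"2024-11-10T01:05:00Z"}]' | jq -r '.[] | [.assetName, .startTime, .endTime] | @csv'

The -r option prints raw text instead of JSON-encoded strings, and the @csv filter renders each array as a CSV row, which is exactly the pattern used in the queries below.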

For instance, if the requirement is to gather Succeeded Taskflow details, this involves two main processes. First, you’ll run the IICS APIs to gather the necessary data. Once you have that data, the next step is to execute a jq query to pull out the specific results. Let’s explore two methods in detail.

Extracting Metadata via Postman and jq:-

Step 1:
To begin, utilize the IICS APIs to extract the necessary data from the cloud repository. After successfully retrieving the data, ensure that you save the file in JSON format, which is ideal for structured data representation.
Step 1 Post Man Output

Step 1 1 Save File As Json

Step 2:
Construct a jq query to extract the specific details from the JSON file. This will allow you to filter and manipulate the data effectively.

Windows:-
(echo Taskflow_Name,Start_Time,End_Time & jq -r ".[] | [.assetName, .startTime, .endTime] | @csv" C:\Users\christon.rameshjason\Documents\Reference_Documents\POC.json) > C:\Users\christon.rameshjason\Documents\Reference_Documents\Final_results.csv

Linux:-
jq -r '["Taskflow_Name","Start_Time","End_Time"],(.[] | [.assetName, .startTime, .endTime]) | @csv' /opt/informatica/test/POC.json > /opt/informatica/test/Final_results.csv

Step 3:
To proceed, run the jq query in the Command Prompt or Terminal. Upon successful execution, the results will be saved in CSV file format, providing a structured way to analyze the data.

Step 3 1 Executing Query Cmd

Step 3 2 Csv File Created
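For reference, the resulting CSV follows the header defined in the command; the data row below is purely illustrative of the format, not actual taskflow output.

Taskflow_Name,Start_Time,End_Time
"tf_Daily_Load","2024-11-16T01:00:05Z","2024-11-16T01:12:48Z"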

Extracting Metadata via Command Prompt and jq:-

Step 1:
Formulate a cURL command that utilizes IICS APIs to access metadata from the IICS Cloud repository. This command will allow you to access essential information stored in the cloud.

Windows and Linux:-
curl -s -L -X GET -u USER_NAME:PASSWORD "https://<BASE_URL>/active-bpel/services/tf/status?runStatus=Success" -H "Accept: application/json"

Step 2:
Develop a jq query along with cURL to extract the required details directly from the JSON response. This query will help you isolate the specific data points necessary for your project.

Windows:
(curl -s -L -X GET -u USER_NAME:PASSWORD "https://<BASE_URL>/active-bpel/services/tf/status?runStatus=Success" -H "Accept: application/json") | (echo Taskflow_Name,Start_Time,End_Time & jq -r ".[] | [.assetName, .startTime, .endTime] | @csv") > C:\Users\christon.rameshjason\Documents\Reference_Documents\Final_results.csv

Linux:
curl -s -L -X GET -u USER_NAME:PASSWORD "https://<BASE_URL>/active-bpel/services/tf/status?runStatus=Success" -H "Accept: application/json" | jq -r '["Taskflow_Name","Start_Time","End_Time"],(.[] | [.assetName, .startTime, .endTime]) | @csv' > /opt/informatica/test/Final_results.csv

Step 3:
Launch the Command Prompt and run the cURL command that includes the jq query. Upon running the query, the results will be saved in CSV format, which is widely used for data handling and can be easily imported into various applications for analysis.

Step 3 Ver 2 Cmd Prompt

Conclusion
To wrap up, the methods outlined for extracting workflow metadata from IDMC are designed to streamline your workflow, minimizing manual tasks and maximizing productivity. By automating these processes, you can dedicate more energy to strategic analysis rather than tedious data collection. If you need further details about IDMC APIs or jq queries, feel free to drop a comment below!

Reference Links:-

IICS Data Integration REST API – Monitoring taskflow status with the status resource API

jq Download Link – Jq_Download

A Step-by-Step Guide to Extracting Workflow Details for PC-IDMC Migration Without a PC Database
https://blogs.perficient.com/2024/11/08/a-step-by-step-guide-to-extracting-workflow-details-for-pc-idmc-migration-without-a-pc-database/

In the PC-IDMC conversion process, it can be challenging to gather detailed information about workflows. Specifically, we often need to determine:

  • The number of transformations used in each mapping.
  • The number of sessions utilized within the workflow.
  • Whether any parameters or variables are being employed in the mappings.
  • The count of reusable versus non-reusable sessions used in the workflow etc.

To obtain these details, we currently have to open each workflow individually, which is time-consuming. Alternatively, we could use complex queries to extract this information from the PowerCenter metadata in the database tables.

This section focuses on XQuery, a versatile language designed for querying and extracting information from XML files. When workflows are exported from the PowerCenter repository or Workflow Manager, the data is generated in XML format. By employing XQuery, we can effectively retrieve the specific details and data associated with the workflow from this XML file.

Step-by-Step Guide to Extracting Workflow Details Using XQuery: –

For instance, if the requirement is to retrieve all reusable and non-reusable sessions for a particular workflow or a set of workflows, we can utilize XQuery to extract this data efficiently.

Step 1:
Begin by exporting the workflows from either the PowerCenter Repository Manager or the Workflow Manager. You have the option to export multiple workflows together as one XML file, or you can export a single workflow and save it as an individual XML file.

Step 1 Pc Xml Files

Step 2:-
Develop the XQuery based on our specific requirements. In this case, we need to fetch all the reusable and non-reusable sessions from the workflows.

let $header := "Folder_Name,Workflow_Name,Session_Name,Mapping_Name"
let $dt :=
    (let $data :=
        ((for $f in POWERMART/REPOSITORY/FOLDER
          let $fn := data($f/@NAME)
          return
              for $w in $f/WORKFLOW
              let $wn := data($w/@NAME)
              return
                  for $s in $w/SESSION
                  let $sn := data($s/@NAME)
                  let $mn := data($s/@MAPPINGNAME)
                  return
                      <Names>{ $fn, ",", $wn, ",", $sn, ",", $mn }</Names>)
         |
         (for $f in POWERMART/REPOSITORY/FOLDER
          let $fn := data($f/@NAME)
          return
              for $s in $f/SESSION
              let $sn := data($s/@NAME)
              let $mn := data($s/@MAPPINGNAME)
              return
                  for $w in $f/WORKFLOW
                  let $wn := data($w/@NAME)
                  let $wtn := data($w/TASKINSTANCE/@TASKNAME)
                  where $sn = $wtn
                  return
                      <Names>{ $fn, ",", $wn, ",", $sn, ",", $mn }</Names>))
     for $test in $data
     return replace($test/text(), " ", ""))
return
    string-join(($header, $dt), "&#10;")

Step 3:
Select the necessary third-party tools to execute the XQuery or opt for online tools if preferred. For example, you can use BaseX, Altova XMLSpy, and others. In this instance, we are using BaseX, which is an open-source tool.

Create a database in BaseX to run the XQuery.

Step 3 Create Basex Db

Step 4: Enter the created XQuery into the third-party tool or online tool to run it and retrieve the results.

Step 4 Execute Xquery

Step 5:
Export the results in the required file format.

Step 5 Export The Output
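If you prefer to skip the GUI entirely, BaseX also ships with a command-line client that can evaluate a query file directly against an XML input. The line below is a minimal sketch; the file names are assumptions, and you should check basex -h for the exact options supported by your BaseX version.

basex -i exported_workflows.xml get_sessions.xq > sessions.csv

Here exported_workflows.xml is the XML exported in Step 1, get_sessions.xq contains the XQuery from Step 2, and the redirected output produces the same comma-separated result without opening the editor.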

Conclusion:
These simple techniques allow you to extract workflow details effectively, aiding in planning and in the early identification of workflows that will require complex manual conversion. Many queries can be written to fetch different kinds of data. If you need more XQueries, just leave a comment below!

Buckle up, we’re headed to Informatica World 2024!
https://blogs.perficient.com/2024/05/16/buckle-up-were-headed-to-informatica-world-2024/

On the Road to Vegas!

We have our passes and are ready to hit the road for Informatica World 2024! This year’s conference is hosted in the heart of Las Vegas at Mandalay Bay Resort and Casino, May 20-23. Informatica World is an annual event that brings together over 2,000 experts to network, collaborate, and strategize new use cases on the Informatica platform.

Informatica empowers customers to maximize their data and AI capabilities. When leveraged properly, the Informatica Intelligent Data Management Cloud (IDMC) will uncover a clear path to success.

Thoughts from Leadership:

“It is always a pleasure to attend a sales conference, but when it comes to Informatica World, the energy is contagious, and the insights gained are nothing short of transformative.”

Informatica Practice Director, Atul Mangla

“I am ecstatic to be a part of the group representing Perficient at Informatica World 2024 and am looking forward to connecting with new folks and delving deep into the ever-expanding capabilities Informatica has to offer.”

Portfolio Specialist, Kendall Reahm

Blog Graphic

The Attendees:

Our team is looking forward to immersing themselves in a full schedule of keynotes, summits, and breakouts to further hone our expertise with the Informatica platform. Perficient will be represented by:

  1. Santhosh Nair – Data Solutions GM/ AVP
  2. Atul Mangla – Informatica Practice Director
  3. Kendall Reahm – Portfolio Specialist
  4. Scott Vines – Portfolio Specialist

The Perficient thought leaders would love to meet you during the event! Reach out and let us know if you are coming so we can collaborate on innovative solutions for you and your customers leveraging the power of Perficient and Informatica.

See you there!

 

As an award-winning Platinum Enterprise Partner, our team helps businesses rapidly scale enterprise services and collaboration tools to create value for employees and customers. We offer a wide range of solutions tailored to the unique needs of each customer.

Learn more about the Perficient and Informatica practice here.

Azure SQL Server Performance Check Automation
https://blogs.perficient.com/2024/04/11/azure-sql-server-performance-check-automation/

On operational projects that involve heavy data processing on a daily basis, there is a need to monitor DB performance. Over a period of time, the workload grows and can cause issues. While there are best practices to handle the processing by adopting DBA strategies (indexing, partitioning, collecting statistics, reorganizing tables/indexes, purging data, allocating bandwidth separately for ETL/DWH users, peak-time optimization, effective query rewrites, etc.), it is necessary to be aware of DB performance and monitor it consistently so that further action can be taken.

If admin access is not available to validate the performance on Azure, building automations can help monitor the utilization and take the necessary steps before the DB runs into performance issues/failures.

For DB performance monitoring, an IICS Informatica job can be created with a Data Task that executes a query against the SQL Server metadata/system views to check performance, and emails can be triggered once CPU/IO utilization exceeds the threshold percentage (e.g., 80%).
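For a quick manual check of the same metrics outside IICS, Azure SQL Database exposes recent CPU and IO utilization through the sys.dm_db_resource_stats view. The command below is a minimal sketch using the sqlcmd utility; the server, database, and login placeholders are assumptions, not values from the original job.

sqlcmd -S <server>.database.windows.net -d <database> -U <user> -P <password> -W -s "," -Q "SELECT TOP 1 end_time, avg_cpu_percent, avg_data_io_percent, avg_log_write_percent FROM sys.dm_db_resource_stats ORDER BY end_time DESC"

The -s and -W switches produce compact comma-separated output, which is convenient if you want to compare the values against the same threshold used by the mapping.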

The IICS mapping design is shown below (scheduled to run hourly). Email alerts contain the metric percentage values.

                        Iics Mapping Design Sql Server Performance Check Automation 1

Note: Email alerts will be triggered only if the threshold limit is exceeded.

                                             

IICS ETL Design : 

                                                     

                     Iics Etl Design Sql Server Performance Check Automation 1

IICS ETL Code Details : 

 

  1. A Data Task is used to get the SQL Server performance metrics (CPU and IO percent).

                                          Sql Server Performance Check Query1a

The query checks whether CPU/IO utilization exceeds 80%. If utilization exceeds the threshold limit (the user can set this to a specific value such as 80%), an email alert is sent.

                                                            

                                         Sql Server Performance Check Query2

If Azure_SQL_Server_Performance_Info.dat has data (data is populated when CPU/IO utilization exceeds 80%), the Decision task is activated and an email alert is triggered.

                                          Sql Server Performance Result Output 1                                          

Email Alert :  

                                            Sql Server Performance Email Alert

SQL Server Space Monitoring
https://blogs.perficient.com/2023/11/28/sql-server-space-monitoring/

On operational projects that involve heavy data volume loads on a daily basis, there is a need to monitor DB disk space availability. Over a period of time, the size grows and occupies the disk space. While there are best practices to handle the size, such as purging outdated data and adding buffer/temp/data/log space to address growing needs, it is necessary to be aware of the disk space and monitor it consistently so that further action can be taken.

If admin access is not available to validate the available space, building automations can help monitor the space and take the necessary steps before the DB runs into performance issues/failures.

For DB space monitoring, an IICS Informatica job can be created with a Data Task that executes a query against the SQL Server metadata tables to check the available space, and emails can be triggered once free space goes below the threshold percentage (e.g., 20%).
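As an illustration of the kind of metadata query such a Data Task can run, the sketch below uses the sqlcmd utility and computes the used percentage per database file from sys.database_files and FILEPROPERTY. The exact query used in the original job is not reproduced in this post, so treat this as an assumption-based example; the connection placeholders are also assumptions.

sqlcmd -S <server> -d <database> -U <user> -P <password> -W -s "," -Q "SELECT name, CAST(FILEPROPERTY(name, 'SpaceUsed') * 100.0 / size AS DECIMAL(5,2)) AS used_pct FROM sys.database_files"

Both size and SpaceUsed are reported in 8 KB pages, so the ratio gives the used-space percentage that can then be compared against the 80% threshold.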

The IICS mapping design is shown below (scheduled to run daily). Email alerts contain the metric percentage values.

 


 

Note: Email alerts will be triggered only if the threshold limit is exceeded.

 

IICS ETL Code Details :

 

  1. A Data Task is used to get the used space of the SQL Server log and data files.


 

The query checks whether used space exceeds 80%. If used space exceeds the threshold limit (the user can set this to a specific value such as 80%), an email alert is sent.

 


 

If D:\Out_file.dat has data (data is populated when used space exceeds 80%), the Decision task is activated and an email alert is triggered.

 

 

Windows Folder/Drive Space Monitoring
https://blogs.perficient.com/2023/11/28/windows-folder-drive-space-monitoring/

Often there is a need to monitor OS disk drive space availability for the drive holding ETL operational files (log, cache, temp, bad files, etc.). Over a period of time, the number of files grows and occupies the disk space. While there are best practices to limit the number of operational files and clear them from the disk on a regular basis (via automations), it is recommended to be aware of the available space.

If admin access is not available to validate the available space, and if the ETL server is on a remote machine, building automations can help monitor the space and take the necessary steps before the ETL processes run into performance issues/failures.

For OS folder/drive space monitoring, an IICS Informatica job can be created with a Command Task that executes Windows commands via batch scripts to check the available space, and emails can be triggered once free space goes below the threshold percentage (e.g., 20%).

The IICS taskflow design is shown below (it can be scheduled bi-weekly or monthly according to the requirements). Email alerts contain the free-space percentage value.

 


Note: Email alerts will be triggered only if the threshold limit is exceeded.

 

IICS ETL Code Details :

 

  1. A Windows Command Task is used to get the free space of the OS drive/network drive/folder on which the ETL server is installed and the log files are held.

 


 

 

D:\space_file_TGT.dat Content: (Drive Name, Free space, Overall Space)

D:,11940427776,549736935424
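One simple way for the Command Task to produce a file in exactly this format is a single command line that shells out to PowerShell. The line below is an illustrative sketch only, not the original script; the drive letter and output path are assumptions.

powershell -NoProfile -Command "$d = Get-PSDrive -Name D; Set-Content -Path 'D:\space_file_TGT.dat' -Value ('D:,{0},{1}' -f $d.Free, ($d.Used + $d.Free))"

Get-PSDrive exposes the free and used bytes of the drive, and their sum approximates the overall size; the percentage calculation and the ALERT flag are left to the IICS Data Task described below.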

 

D:\Out_file.dat content: (Drive Name, Free Space [GB], Overall Space [GB], Flag [set to ALERT if free space < 25 GB], Used Space percent)

D:,11940427776,549736935424,ALERT,98%

  2. An IICS Data Task is used to populate D:\Out_file.dat.

If D:\Out_file.dat has data (data is populated when free space is below 25 GB), the Decision task is activated and an email alert is triggered.

Email Alert :

 


 

Informatica PowerCenter Overview: Part 1
https://blogs.perficient.com/2023/05/15/informatica-powercenter-overview-part-1/

What is ETL?

ETL is a process that extracts the data from different source systems, then transforms the data (like applying calculations, concatenations, etc.), and finally loads the data into the Data Warehouse system. The full form of ETL is Extract, Transform, and Load.


What is a data warehouse (DW)?

A Data Warehouse (DW) is a relational database that is designed for query and analysis rather than transaction processing. It includes historical data derived from transaction data from single and multiple sources. The purpose of a data warehouse is to connect and analyze business data from heterogeneous sources. Data warehouses are at the core of business intelligence solutions, which analyze and report on data.

Different types of ETL Tools available?

There are multiple ETL tools available in the market such as Adeptia Connect, Alooma Platform, CData Driver Technologies, Fivetran, IBM InfoSphere Information Server, Informatica Intelligent Data Platform, Matillion ETL, SQL Server Integration Services (SSIS), Oracle Data Integration Cloud Service, Talend Open Studio, SAS Data Management, and others.

In this blog, we will be discussing Informatica PowerCenter:

What is Informatica?

Informatica is a data integration tool based on ETL architecture. Its main applications are data integration across business applications, data warehousing, and business intelligence. Informatica has built-in functionality to connect to various source systems such as databases, file systems, or SaaS-based applications using configurations, adapters, and built-in connectors. Data is extracted from the source systems through Informatica, transformed on the server, and fed into data warehouses.

Example: it is possible to connect to many database servers at once; here, both an Oracle and a Microsoft SQL Server database are connected, and the data from one can be combined with that of the other system.

Why We Need Informatica?

  1. If we need to perform operations on the data at the back end of a data system, we need Informatica.
  2. To modify, update, or clean up the data based on a set of rules, we need Informatica.
  3. Informatica makes it easy to load bulk data from one system to another.

Components in Informatica.

Informatica consists of two types of components:

  • Server component: Repository service, integration service, Domain, Node
  • Client component: Designer, Workflow Manager, Workflow Monitor, Repository Manager

Informatica ETL tool has the below services/components, such as:

  • Repository Service: It is responsible for maintaining Informatica metadata and provides access to the same to other services. The PowerCenter Repository Service manages connections to the PowerCenter repository from repository clients. The Repository Service is a separate, multi-threaded process that retrieves, inserts, and updates metadata in the repository database tables. The Repository Service ensures the consistency of metadata in the repository.

 

  • Integration Service: This service helps in the movement of data from sources to targets. The PowerCenter Integration Service reads workflow information from the repository. The Integration Service connects to the repository through the Repository Service to fetch metadata from the repository. The Integration Service can combine data from different platforms and source types. For example, you can join data from a flat file and an Oracle source. The Integration Service can also load data to different platforms and target types.

 

  • Reporting Service: This service generates the reports. After you create a Reporting Service, you can configure it. Use the Administrator tool to view or edit the Reporting Service properties. To view and update properties, select the Reporting Service in the Navigator. In the Properties view, click Edit in the properties section that you want to edit. If you update any of the properties, restart the Reporting Service for the modifications to take effect.

 

  • Repository Manager: Repository Manager is a GUI-based administrative client component that allows users to create new folders and is used to organize the metadata stored in the repository. The metadata in the repository is organized in folders, and the user can navigate through multiple folders and repositories, as shown in the image below.

Repository Manager

 

  • Informatica Designer: Informatica PowerCenter Designer is a graphical user interface (GUI) for creating and managing PowerCenter objects such as sources, targets, mapplets, mappings, and transformations. To develop ETL applications, it provides a set of tools for building mappings. PowerCenter Designer creates mappings by importing source tables with the Source Analyzer, importing target tables with the Target Designer, and then connecting and transforming these tables.

Designer

  • Workflow Manager: The Workflow Manager is used to create and run workflows and other tasks. You must first create tasks, such as a session containing the mapping you built in the Designer, before you can assemble a workflow. You then connect tasks with conditional links to specify the order of execution for the tasks you created.

The Workflow Manager consists of three tools to help you develop a workflow:

    1. Task Developer – Use the Task Developer to create tasks you want to run in the workflow.
    2. Workflow Designer – Use the Workflow Designer to create a workflow by connecting tasks with links. You can also create tasks in the Workflow Designer as you develop the workflow.
    3. Worklet Designer – Use the Worklet Designer to create a worklet. A worklet is an object that groups a set of tasks.

Workflow Manager

 

  • Workflow Monitor: Informatica Workflow Monitor makes it easy to track how tasks are completed. In general, Informatica PowerCenter lets you track or monitor the event log information, the list of executed workflows, and their execution times in detail.

The Workflow Monitor consists of the following windows:

  • Navigator window – Displays monitored repositories, servers, and repository objects.
  • Output window – Displays messages from the Integration Service and Repository Service.
  • Time window – Displays the progress of workflow runs.
  • Gantt Chart view – Displays details about workflow runs in chronological format.
  • Task view – Displays details about workflow runs in a report format.

 Workflow Monitor

So, in Part 1 we have seen an overview of Informatica PowerCenter and gained a basic understanding of the available tools. In the next blog, we will discuss the various transformations available in Informatica PowerCenter.

Please share your thoughts and suggestions in the space below, and I’ll do my best to respond to all of them as time allows.

For more such blogs, click here.

Happy Reading!

Implementation of SCD type 1 in Informatica PowerCenter
https://blogs.perficient.com/2023/04/19/implementation-of-scd-type-1-in-informatica-powercenter/

What is a Slowly Changing Dimension?

A Slowly Changing Dimension (SCD) is a dimension that stores and manages both current and historical data over time in a data warehouse. It is considered and implemented as one of the most critical ETL tasks in tracking the history of dimension records.

Type 1 SCDs – Overwriting

In a Type 1 SCD the new data overwrites the existing data. Thus, the existing data is lost as it is not stored anywhere else. This is the default type of dimension you create. You do not need to specify any additional information to create a Type 1 SCD.

Slowly Changing Dimension Type 1:

Slowly Changing Dimension Type 1 is used to maintain only the latest data by comparing incoming data with the existing data in the target. New records are inserted, and changed records are updated by overwriting the existing data. All the records contain current data only.

It is used to update the table when you do not need to keep any previous versions of data for those records (No historical records stored in the table).

Example:

This sample mapping showcases how SCD Type 1 works. In this exercise, we do not compare column by column to check whether anything has changed in an existing record; we only check the primary key: if it exists, the record is updated, otherwise it is inserted as new.

Please connect & open the Repository Folder where you want to create mapping and workflow.

1. Connect and Open the folder if not already opened.

2. Select Tools –> Mapping Designer


3. Select Mappings –> Create –> Enter the mapping name you want to create. Then click on “OK”.

4. Drag & drop the required source instance to mapping.


5. Drag & Drop target table to mapping (take 2 instances one for Insert and the other for update process)


6. Add a Lookup to the mapping. The lookup instance is on the target table, to check whether each incoming record already exists: if it does not, the record is inserted, otherwise it is updated. (Here the lookup is a connected lookup.)

Target Table

Here we need to look up the target table so select the location of the lookup table as “Target” and select the table from the list under the Targets folder as shown below.


Then Click on “OK”

7. The lookup instance will be added to the mapping as shown below.

8. Now drag the required columns from the Source Qualifier to the Lookup transformation, as shown below.


9. To define the lookup condition, double click on lookup transformation –> go to the condition tab


 

10. Drag the lookup primary key (from the Lookup) along with the other columns that were dragged from the Source Qualifier into the Router transformation, which will route/separate records for insert and update.

Output Variable Ports

And we will create two output variable ports for new records and updated records.

o_new_records = IIF(ISNULL(lkp_DEPT_ID),TRUE,FALSE)

o_updated_records = IIF((DEPT_ID = lkp_DEPT_ID) AND
                        ((DEPT_NAME != lkp_DEPT_NAME) OR
                         (DEPT_LOC != lkp_DEPT_LOC) OR
                         (DEPT_HEAD != lkp_DEPT_HEAD)), TRUE, FALSE)


11. Condition to separate records for Insert and Update.

Double-click on the Router transformation and go to the Groups tab to create two groups: one for the insert condition and the other for the update condition.

For the NEW_RECORD group: o_new_records
Note: If the lookup DEPT_ID is null, there is no matching record in the target, so the row goes for insert.

For the UPDATED_RECORDS group: o_updated_records
Note: If the lookup DEPT_ID is not null, there is a matching record in the target, so the row goes for update.


12. From the NEW_RECORD group of the Router transformation, map the columns to the target table instance used for inserts.

(Note: the default row type for incoming rows is insert, which is why no Update Strategy is used in the insert flow.)


13. Add an Update Strategy transformation to flag incoming records for update. Drag the required columns from the Router transformation's UPDATED_RECORDS group, as shown above.

14. Double-click on the Update Strategy and go to the Properties tab.

Under “Update Strategy Expression”, enter DD_UPDATE as shown below.


15. Map the required columns from the Update Strategy to the target instance used for the update flow.


16. Create the workflow for the above mapping.

17. Connect and open the folder under which you have created the mapping.

18. Select “Workflows” from the menu –> Click on “Create…” as shown below


19. It will pop up the screen below. Enter the name for the workflow.


Then Click on “OK”.

20. Create the session for the mapping by clicking the icon shown in the screen below. It will pop up the mappings list; from the list, select the mapping for which you want to create this session, as shown below.


Now that the session is created, link the session to the Start task as shown below.

21. Double-click on the session, then go to the Properties tab:


By default, “Treat source rows as” is set to Insert, but whenever you add an Update Strategy to the mapping, “Treat source rows as” automatically changes to Data Driven.

22. Then go to the Mapping tab to assign/map the source, target, and lookup database connection information.


23. Go to the Sources folder in the left-side navigator, then select the source (SQ_DEPARTMENT_DETAILS) to assign a database connection. Click on the down arrow button to get the list of connections available for this repository and select the required one from the list. ex: oracle is the connection name pointing to the Oracle database in this example,


24. Similarly, go to the Targets folder in the left-side navigator, then select the target (DEPARTMENT_CURRENT_insert) to assign a database connection. Click on the down arrow button to get the list of connections available for this repository and select the required one from the list, e.g., Oracle is the connection name pointing to the Practice database in this example.

In the session properties, select Insert to insert data into the target.


25. Then go to the Targets folder in the left-side navigator and select (DEPARTMENT_CURRENT_insert1) to assign a database connection. Click on the down arrow button to get the list of connections available for this repository and select the required one from the list, e.g., Oracle is the connection name pointing to the HR database in this example.

In the session properties, select Update to update data in the target.


26. Check all the transformations, then click on “Apply” and “OK”.

27. Save the session and workflow. then run the session/workflow.

28. When you run the session first time all the records will be inserted.

29. In the screen below, I have inserted one record and modified records 2 and 4; when you run the job a second time, those records will be updated.


30. In the screen below, the records highlighted with the red box are modified/updated records, and the ones highlighted in the green box are newly inserted records.


31. The other records, which are not highlighted, are simply overwritten with the same values, as they have no changes.

This is all about the Implementation of SCD type 1 in Informatica PowerCenter. I hope you enjoyed reading this post and found it to be helpful.

For more blogs click here.

Happy Reading!

Performance Tuning Guidelines – Informatica PowerCenter
https://blogs.perficient.com/2023/02/25/performance-tuning-guidelines-informatica-powercenter/

Quite often, while building a data integration pipeline, performance is a critical factor. The guidelines below are vital to follow while working on ETL processing with Informatica PowerCenter.

The following items are to be considered during ETL DEV:

  • Pre-Requisite Checks and Analysis
  • Basic Tuning Guidelines
  • Additional Tuning Practices

Tuning Approach

Pre-Requisite Checks/Analysis 

Before subjecting an ETL mapping to performance improvements, the steps below should be adopted:

  • Deep Dive into the Mapping to gather Basic Info.
    • Complexity of the Mapping (# of SRCs/TGTs, Transformations, Technical Logic)
    • Design of the Mapping (End-End Flow, Single/Multiple Pipelines)
    • Whether Best Practices followed
  • Verify the As-Is Metrics of the Mapping
    • Data Volume (SRC/TGT)
    • Duration of the Job Completion
    • Throughput
    • Busy Percentage of the Threads (Reader/Writer/Transformation)
    • Collect Performance Statistics
  • Ensure the ETL Server/System is not the reason for processing slowness
    • Are there frequent Network Connectivity issues?
    • Does the ETL System/Server has required H/W Capabilities?
    • Does the ETL Metadata DB have Enough Space?
    • Whether the System has Accumulated Log/Cache files blocking Server space?
    • DBs Slow with READ/WRITE?

After the above prerequisites are taken care of and bottlenecks are identified, and if the ETL development is recognized as the root cause of the slowness, tuning practices can be applied to the mappings where a significant improvement is expected toward meeting SLAs and other business benefits.

Basic Tuning Guidelines

Basic Guidelines are listed below :

  • Design Perspective
    • Bring relevant and required fields on subsequent transformations
    • Perform Incremental Extracts to limit processing
    • Use Informatica CDC Drivers to process only Changed Data
    • Filter data as early as possible in the pipelines
    • Limit the Data via Equi-Joins (JNR) upfront before Left-Joins (JNR) on Large Tables
  • DB Perspective
    • Build Indexes (High Volume Tables on Frequently used Joins/Predicates)
    • Create DB Partitions (for Large Fact Tables)
    • Collect STATISTICS
    • DB performs Faster processing (Complex Transformation Logic) than ETL
  • Delegation Perspective
    • Use PDO if DB Server has appreciable Computing Abilities
    • If DB Server has High Workload, push Functions Logic to Informatica Transformations
    • If DB has difficulty with Aggregations/Sorting, use Informatica Transformations
  • Space Perspective
    • Have a Retention period for Log/Cache files
    • Increase SRT/AGG/JNR Cache Size and DTM Buffer Size
  • Transformations/Load Perspective
    • Sorted Input data before LKP/AGG/JNR Transformations
    • JNR with Master Source having less records and distinct values
    • Consider BULK Load and External Loaders for Data Dump (after removing Index)
    • Use LKP Persistent Cache for re-use requirements
    • Datatype consistency helps ETL operating with SRT, AGG, and JNR
    • Optimize LKPs by looking up only relevant data (Override Filters) instead of the entire table
    • Avoid LKP Override sort for small tables
    • Use UPD Strategy Transformation (only if necessary), can go for session-level updates
    • If LKP is on a high volume table and causes performance issues, consider JNR Transformation

Additional Tuning Practices

Additional Tuning Practices are listed below :

  • Use Informatica Partitions (Pass Through/Key Range/Hash Key etc.) if the data volume is high
  • Do not use SRC and TGT as the same DB Table. Do an SRC – File TGT. Then FILE – DB TGT
  • Do not perform all ETL Operations in 1 Mapping. Divide ETL works with a Series of Mappings
  • Use Concurrent Workflow Exec setting to enable parallel loads with different Params
  • Process ETL in multiple batches (ex. 2 times a day) to release the Table post load
  • If Complex ETL logic causes slowness, use FILE as TGT. Then 1:1 Load from FILE-TGT DB
  • Monitor Storage Space (Logs), and use ETL Automation to clear files by Frequency (Mly and Qly)

Conclusion

On a high level, below are the inferences :

  • Tuning need not be performed on every ETL mapping. Only those ETL jobs that are pain points to meeting Data Extraction and Loads SLAs be considered potential candidates for further investigations and tuning.
  • DB Query optimization also plays a crucial role with SQL Overrides when used.
  • Delegate load b/w DB and ETL Servers.
  • Optimize ETL Design by following the Best Practices.
  • Monitor Storage Space and Computing Abilities.
  • Consider deploying Informatica Nodes on a GRID for High Availability and Load Balancing.
New OneStream and Informatica Alliance Manager: Juliette Collins
https://blogs.perficient.com/2022/08/23/new-onestream-alliance-manager-juliette-collins/

At Perficient, it’s our people who make a difference, and we’re excited to introduce the newest addition to our team. Meet OneStream and Informatica Partner Alliance Manager, Juliette Collins!

With over twenty years of sales and relationship management experience, Juliette is using her expertise to build meaningful bonds and drive even bigger, better Perficient partnerships.

So, I sat down with our latest team member to gain insight as to what her vision is for this new role.

What is your overall goal as the new OneStream Alliance Lead? How do you plan to achieve that goal?

Juliette: The overall goal is to pave the way for new and add on revenue for both Perficient and the partners I manage. To do that, I need to build, nurture and support relationships along the way. My goal is to educate both Perficient sellers and partner contacts on what a successful partnership offers and expand on the thoughts and ideas that are available from the experts on both sides, making sure that I’m connecting the dots along the way.

What is your main responsibility within this role?

Juliette: My role is to create and execute a go-to-market strategy with my partners and to support events, campaigns, activities, and marketing efforts that brand us together and make Perficient the go-to partner. Create, execute, support. The trifecta.

How do you plan to promote Perficient and our partnerships with OneStream and Informatica?

Juliette: Shout it from the rooftops! I plan to educate people on the outrageous value of OneStream as the CPM platform for the Office of Finance. OneStream can transform the office of the CFO to the point where it’s been life changing, freeing people to leave the office on time or to take time off. The power of Perficient and OneStream make a difference to the quality of the environment that people in the Office of Finance are working in. Sharing these ideas with Perficient about the transformation that OneStream brings to the Office of Finance is my top priority.

What are you most excited about as the new OneStream Alliance Manager?

Juliette: The opportunities. The opportunity for growth right now is limitless. Being a OneStream diamond partner already positions us as a trusted, successful partner in that ecosystem. What’s next is continuing to grow that relationship from east to west, north to south, and potentially globally. It’s exciting to know that I can drive the energy and revenue growth by supporting our team in their efforts to implement OneStream.

Do you have any advice for our consultants and sellers?

Juliette: The fortune is in the follow up.

It’s important to remember to go back to sales 101, to the basics, to the beginning. I’m there to support those efforts when looking at a potential opportunity or discussing a deal or needing to gather information to reply to someone. I want to support the follow up and make sure that we do what we’re saying we’re going to do when we say we’re going to do it. That’s how we earn the respect to be the first call that someone’s going to make. That’s how we earn the opportunity to be at the table. Together, we have a lot to do, let’s get some time on the calendar to get it done.

About Our OneStream Practice

Perficient is a OneStream Diamond Partner and Authorized Training provider. We help companies of today become the companies of tomorrow by implementing the next-generation capabilities of OneStream.

Perficient has more than 20 years of experience delivering Corporate Performance Management solutions. Our CPM Practice includes an advisory services capability focused on optimizing financial business processes while aligning workflows and technology to best practices.

We have successfully delivered more than 1500 projects for leading brands across multiple industries.

Read more about our OneStream partnership.

People of Perficient

It’s no secret, our success is because of our people. No matter the technology or time zone, our colleagues are committed to delivering innovative, end-to-end digital solutions for the world’s biggest brands, while bringing a collaborative spirit to every interaction. Meet more of our Perficient team.

We’re always seeking the best and brightest to work with us. Check out our careers page to learn more!

What is Supplier 360 and What Does It Look Like
https://blogs.perficient.com/2022/08/23/what-is-supplier-360-and-what-does-it-look-like/

The Challenge in Supply Chain

Let’s face it. You and your suppliers are linked together and what they do affects you. So if your supplier data is fragmented across dozens or even hundreds of systems, it’s probably costing you. What is causing these supply chain leaks?

Data stored in various places like supply chain management software, ERP systems, accounts payable, and spreadsheets each has its own structure and data model. That makes it impossible to access a trusted view of all your suppliers and sub-suppliers, and it leads to leaks in your supply chain.

The good news? It’s largely preventable.

What Are Supply Chain Leaks

Leaks are thousands of inefficiencies, blind spots, and missed opportunities spread across your supplier universe. For example, manual processes, lack of collaboration, delivery problems, missed discounts, and more.

In order to resolve these leaks and prevent new ones, you must bring everything you know about your suppliers together in one place to create a resilient and financially sustainable supply chain with a 360-degree view. This can be achieved through 360 degree supplier information management or Supplier 360.

How Supplier 360 Resolves Leaks

Informatica’s Supplier 360 is a single system that provides clean, consistent, and connected information to help you make better business decisions and implement processes that save you money.

The leaks in your supply chain can be avoided using this data-driven solution by providing you with:

  • Centrally managed, trusted, governed, and relevant data across the entire business
  • A single view of suppliers that’s visible to all buying teams and applications
  • Managed dynamic supplier hierarchies allowing you to stay on top of changes
  • Reimagined and streamlined collaboration
  • End-to-end visibility into your value chain
  • Automated workflows built around supplier data

The Supplier 360 approach allows you to become more efficient and deliver industry disruptive results while saving money and gaining efficiency.

What Supplier 360 Looks Like in Action

360-degree supplier information management is built on four main pillars that are critical to gaining a single supplier view that you can trust.

  1. Proactive Supplier Data Governance

In order to achieve a 360 view of supplier information management, your company must be committed to treating supplier data like the important strategic business asset it is. Senior Management and major stakeholders all must buy-in.

  2. Intelligent Data Quality

The data inside each supplier-facing application must be trusted, accessible, and timely. If the supplier information in your applications is inaccurate and incomplete, it will pollute your master data as well.

  3. Intelligent Master Data Management

The key to a successful Supplier 360 is the ability to create a single version of the truth about every supplier and then automatically share it with any application that needs it, using a consistent data schema that defines how the data is organized and related. As a result, this enables you to manage different views of supplier hierarchies to meet the needs of different buying teams and departments.

  4. Data Integration

What this all comes down to is data integration. You must be able to move data from your dozens or hundreds of source systems into a master data management (MDM) system before you can synchronize the mastered supplier data with target applications and data warehouses.

Apply all four of the pillars above to the highly specific domain of supplier information and you’ll have 360-degree Supplier Information Management. If you apply a solution that is missing just one of these pillars you’ll have a half-cooked infrastructure that is bound to let you down. The good news is that it doesn’t matter where you are today or how fragmented your supplier information is right now. With Supplier 360 you can get where your business needs to be.

Partner With Our Supplier 360 Experts

With our successful experience in supply chain and data management, we have helped numerous clients achieve growth, reduce costs and risk, as well as increase productivity in their operations.

Get in contact with an expert!

A MDM Success Story: Streamlining Claims Processing and Payments With Reliable, Centralized Data
https://blogs.perficient.com/2022/06/29/a-mdm-success-story-streamlining-claims-processing-and-payments-with-reliable-centralized-data/

Does this sound familiar: you’re running multiple point-of-care systems, each with redundant data points collected. But what happens when there are small variations in how data is captured and processed across these systems? As a healthcare provider client of ours discovered, inconsistent data can reduce revenue realization. It caused claims to be improperly matched and processed… then denied – a stressful, frustrating, and business-impacting experience.

We partnered with the provider to consolidate their siloed data. Our Agile practitioners recognized the organization’s challenge as a prime use case for Informatica, an automated solution that would compare, streamline, and harmonize patient data in order to establish a single, reliable source of truth and boost claims processing efficiencies.

READ MORE: We’re A Cloud and On-Premises Certified Informatica Partner | Master Data Management: Modern Use Cases for Healthcare

Increasing Revenue and Improving Cycle Processing With a Single Source of Truth

To deliver a single source of truth, our team of industry and technical experts partnered with the provider to achieve the following:

  • Created a shared vision and enabled teams for the new solution’s rollout.
  • Built an automated solution on Informatica’s Data Quality and Master Data Management Hub to ingest, cleanse, compare, and consolidate data between various internal and external sources, proactively identifying mismatches for quick resolution.
  • Established a Center for Enablement (C4E) to oversee information governance and resolve process gaps across multiple workstreams.

Data quality improvements achieved with this solution significantly reduced the cost and complexity associated with integrating newly acquired hospitals, increased the frequency of successfully billed patient claims, and limited confusion and frustration associated with providers’ attempts to understand and reverse denied claims.

And the results are impressive. Want to learn more? Check out the complete success story here.

DISCOVER EVEN MORE SOLUTIONS: Master Data Management: Modern Use Cases for Healthcare

Healthcare MDM Solutions

With Perficient’s expertise in life sciences, healthcare, Informatica, and Agile leadership, we equipped this multi-state integrated health network with a modern, sustainable solution that vastly accelerated the revenue realization cycle.

Have questions? We help healthcare organizations navigate healthcare data, business processes and technology, and solution integration and implementation. Contact us today, and let’s discuss your specific needs and goals.
