Informatica Articles / Blogs / Perficient

PWC-IDMC Migration Gaps

With technological advancements arriving almost every minute, upgrading a business is essential to stay competitive, deliver a customer experience beyond expectations, and derive value from every process while deploying fewer resources.

Platform upgrades, software upgrades, security upgrades, architectural enhancements, and so on are required to ensure stability, agility, and efficiency.

Customers prefer to move from legacy systems to the cloud because of what it offers. Across cost, monitoring, maintenance, operations, ease of use, and landscape, the cloud has transformed data and analytics (D&A) businesses significantly over the last decade.

Moving from Informatica PowerCenter to IDMC is widely seen as the need of the hour because of the significant advantages it offers. Developers must understand both platforms to perform this code transition effectively.

This post explains the PowerCenter (PWC) vs. IDMC CDI gaps from the following perspectives.

  • Development
  • Data
  • Operations

Development

  • Differences in native datatypes can be observed in IDMC when importing Source, Target, or Lookup objects. Workaround:
    • If any inconsistency is observed in IDMC mappings with the native datatype, precision, or scale, ensure the metadata is edited to keep the DDL and the CDI mapping in sync.
  • In CDI, taskflows have issues reading and consuming workflow parameter values. Workaround:
    • Create a dummy Mapping task that defines the list of parameters/variables for further consumption by tasks within the taskflows (e.g., Command task, Email task).
    • Limit the number of dummy Mapping tasks during this process.
    • Best practice is to create one dummy Mapping task per folder to capture all the parameters/variables required for that entire folder.
    • For variables whose values need to persist into the next taskflow run, map the variable value to the dummy Mapping task via an Assignment task. This dummy Mapping task is used at the start and end of the taskflow so that the overall taskflow supports incremental data processing.
  • All mapping tasks/sessions in IDMC are reusable and can be used in any taskflow. If some audit sessions are expected to run concurrently within other taskflows, ensure the property "Allow the mapping task to be executed simultaneously" is enabled.
  • Sequence Generator: data overlap issues occur in CDI. Workaround:
    • If a Sequence Generator is likely to be used in multiple sessions/workflows, make it a reusable/shared sequence.
  • VSAM sources/Normalizer are not available in CDI. Workaround:
    • Use the Sequential File connector type for mappings that use mainframe VSAM sources/Normalizer.
  • Sessions configured with STOP ON ERRORS > 0. Workaround:
    • Set the LINK condition for the next task to "PreviousTask.TaskStatus STARTS WITH ANY OF 1, 2" within CDI taskflows.
  • Partitions are not supported with sources in Query mode. Workaround:
    • Create multiple sessions and run them in parallel.
  • Currently, parameterization of Schema/Table is not possible for mainframe DB2. Workaround:
    • Use an ODBC-type connection to access DB2 with Schema/Table parameterization.
  • A mapping with a Lookup transformation used across two sessions cannot be overridden at the session or mapping task level to enable or disable caching. Workaround:
    • Use two different mappings with Lookup transformations if one mapping/session must have caching enabled and the other must have caching disabled.

Data

  • IDMC output data contains additional double quotes. Workaround:
    • Session level – use the property __PMOV_FFW_ESCAPE_QUOTE=No
    • Administrator settings level – use the property UseCustomSessionConfig=Yes
  • IDMC output data contains additional scale values with the Decimal datatype (e.g., 11.00). Workaround:
    • Use an IF-THEN-ELSE expression to remove the unwanted trailing zeros (output: 11.00 -> 11); an example expression follows this list.
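The post does not show the exact expression, so the following is only a sketch of one way to write it using standard Informatica expression functions (IIF, TO_CHAR, INSTR, RTRIM); the port name AMOUNT is hypothetical:

IIF(INSTR(TO_CHAR(AMOUNT), '.') > 0, RTRIM(RTRIM(TO_CHAR(AMOUNT), '0'), '.'), TO_CHAR(AMOUNT))

This converts the decimal to a string, strips trailing zeros after the decimal point, and then strips a dangling decimal point, so 11.00 becomes 11 while 11.50 becomes 11.5.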

Operations

  • CDI doesn't store logs on Cloud beyond 1,000 mapping task runs in 3 days (it does store logs on the Secure Agent). Workaround:
    • To retain Cloud job run stats, create audit tables and use the Data Marketplace utility to load the audit info (volumes processed, start/end time, etc.) into the audit tables by scheduling this job at regular intervals (hourly or daily).
  • Generic restartability issues occur during IDMC operations. Workaround:
    • Introduce a dummy Assignment task whenever the code contains a custom error-handling flow.
  • SKIP FAILED TASK and RESUME FROM NEXT TASK operations have issues in IDMC. Workaround:
    • Append an additional condition to every LINK condition: "Mapping task.Fault.Detail.ErrorOutputDetail.TaskStatus=1".
  • In PWC, any task can be run from anywhere within a workflow; this is not possible in IDMC. Workaround:
    • A feature request is being worked on by GCS to update the software.
  • Mapping task log file names cannot be suffixed with the concurrent-run workflow instance name at the IDMC mapping task configuration level because of parameter concatenation issues. Workaround:
    • Use a separate parameter within the parameter file to have the mapping task log file names suffixed with the concurrent-run workflow instance name.
  • IDMC doesn't honour the "Save Session log for these runs" property set at the mapping task level when the session log file name is parameterized. Workaround:
    • Copy the mapping task log files on the Secure Agent server after the job run.
  • If the Session Log File Directory contains a slash (/) when used along with parameters (e.g., $PMSessionLogDir/ABC) under Session Log Directory Path, every run log is appended to the same log file. Workaround:
    • Use a separate parameter within the parameter file for $PMSessionLogDir.
  • In IDMC, @numAffectedRows is not available to get the source and target success rows for loading into the audit table. Workaround:
    • Use @numAppliedRows instead of @numAffectedRows.
  • Concurrent runs cannot be performed on taskflows from the CDI Data Integration UI. Workaround:
    • Use the Paramset utility to upload concurrent paramsets and use the runAJobCli utility to run taskflows with multiple concurrent run instances from the command prompt.

Conclusion

While performing PWC to IDMC conversions, the Development, Data, and Operations workarounds above will help avoid rework and save effort, thereby improving customer satisfaction in delivery.

IDMC – CDI Best Practices

Every end product must meet and exceed customer expectations. A successful delivery is not just about doing what matters, but also about how it is done: by following and implementing the desired standards.

This post outlines the best practices to consider with IDMC CDI ETL during the following phases.

  • Development
  • Operations 

Development Best Practices

  • Check native datatypes between the database table DDLs and the IDMC CDI mapping Source, Target, and Lookup objects.
    • If any inconsistency is observed in IDMC mappings with the native datatype, precision, or scale, ensure the metadata is edited to keep the DDL and the CDI mapping in sync.
  • In CDI, for workflow parameter values to be consumed by taskflows, create a dummy Mapping task that defines the list of parameters/variables for further consumption by tasks within the taskflows (e.g., Command task, Email task).
    • Limit the number of dummy Mapping tasks during this process.
    • Best practice is to create one dummy Mapping task per folder to capture all the parameters/variables required for that entire folder.
    • For variables whose values need to persist into the next taskflow run, map the variable value to the dummy Mapping task via an Assignment task. This dummy Mapping task is used at the start and end of the taskflow so that the overall taskflow supports incremental data processing.
  • If some audit sessions are expected to run concurrently within other taskflows, ensure the property "Allow the mapping task to be executed simultaneously" is enabled.
  • Avoid using the SUSPEND TASKFLOW option, as it requires manual intervention and may cause issues during job restarts.
  • Ensure correct parameter representation using a single dollar sign or double dollar sign; if represented incorrectly, CDI will not read the parameters during job runs (a parameter-file sketch follows this list).
  • While working with flat files in CDI mappings, always enable the property "Retain existing fields at runtime".
  • If a Sequence Generator is likely to be used in multiple sessions/workflows, make it a reusable/shared sequence.
  • Use the Sequential File connector type for mappings that use mainframe VSAM sources/Normalizer.
  • If a session is configured with STOP ON ERRORS > 0, set the LINK condition for the next task to "PreviousTask.TaskStatus STARTS WITH ANY OF 1, 2" within CDI taskflows.
  • For mapping task failure flows, set the LINK condition for the next task to "PreviousTask.Fault.Detail.ErrorOutputDetail.TaskStatus STARTS WITH ANY OF 1, 2" within CDI taskflows.
  • Partitions are not supported with sources in Query mode. Create multiple sessions and run them in parallel as a workaround.
  • Currently, parameterization of Schema/Table is not possible for mainframe DB2. Use an ODBC-type connection to access DB2 with Schema/Table parameterization.
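The post does not include a parameter-file example for the single-dollar/double-dollar point, so the following is only a minimal sketch assuming the standard IICS section format of [project].[folder].[task]; every name and value here is hypothetical. In-out parameters are written with a double dollar sign, while input parameters such as parameterized connections use a single dollar sign:

#USE_SECTIONS
[MyProject].[MyFolder].[mct_sales_load]
$$Product_Discount=10
$SRC_Connection=ORACLE_SRC_CONN
[Global]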

Operations Best Practices

  • Use the Verbose Data session log configuration only if absolutely required, and then only in lower environments.
  • Ensure the sessions pick up the parameter values properly during job execution.
    • This can be verified by temporarily changing the parameter names and values to incorrect values and checking whether the job fails during execution. If the job fails, the parameters are being read correctly by the CDI sessions.
  • Ensure the taskflow name and API name always match. If they differ, the job will face issues when executed via the runAJobCli utility from the command prompt.
  • CDI doesn't store logs on Cloud beyond 1,000 mapping task runs in 3 days (it does store logs on the Secure Agent). To retain Cloud job run stats, create audit tables and use the Data Marketplace utility to load the audit info (volumes processed, start/end time, etc.) into the audit tables by scheduling this job at regular intervals (hourly or daily).
  • To avoid issues with generic restartability during operations, introduce a dummy Assignment task whenever the code contains a custom error-handling flow.
  • To facilitate SKIP FAILED TASK and RESUME FROM NEXT TASK operations, append an additional condition to every LINK condition: "Mapping task.Fault.Detail.ErrorOutputDetail.TaskStatus=1".
  • If mapping task log file names are to be suffixed with the concurrent-run workflow instance name, do it within the parameter file; the IDMC mapping task configuration level cannot do this because of parameter concatenation issues.
  • Copy mapping task log files on the Secure Agent server after the job run, since IDMC doesn't honour the "Save Session log for these runs" property set at the mapping task level when the session log file name is parameterized.
  • Ensure the Session Log File Directory doesn't contain a slash (/) when used along with parameters (e.g., $PMSessionLogDir/ABC) under Session Log Directory Path. When it does, every run log is appended to the same log file.
  • Concurrent runs cannot be performed on taskflows from the CDI Data Integration UI. Use the Paramset utility to upload concurrent paramsets and use the runAJobCli utility to run taskflows with multiple concurrent run instances from the command prompt.

Conclusion

In addition to coding best practices, following these Development and Operations best practices will help avoid rework and save effort, thereby achieving customer satisfaction with the delivery.

Salesforce + Informatica: What It Means for Data Cloud and Our Customers

Salesforce’s acquisition of Informatica is a big move and a smart one. It brings together the world’s #1 AI CRM with a top name in enterprise cloud data management. Together, they’re taking on a challenge every business faces — turning scattered, messy data into something clean, connected, and ready to power AI.

What’s Next for Data Cloud

To understand why Salesforce’s acquisition of Informatica matters, it helps to look at where Data Cloud is headed.

Salesforce created Data Cloud to unify enterprise data and make it actionable in real time through tools like Agentforce. The goal is simple: help sellers, service agents, and AI-driven tools work smarter and faster. But as backend systems grow more complex, the need to harmonize scattered, inconsistent data has become a critical challenge.

Well-organized, reliable data fuels better decisions and stronger outcomes. That’s why enterprise leaders are prioritizing identity resolution, data quality, and multi-domain standardization like never before.

Turning Disconnected Records into Actionable Insights

Informatica brings the structure and scalability that enhance what Data Cloud already offers. With deep expertise in Master Data Management (MDM), data governance, and metadata control, Informatica helps organizations organize their data more effectively and create consistent, connected views across the enterprise.

The MDM platform at the core of Informatica’s offering plays a key role in solving identity resolution, one of the most persistent challenges for Data Cloud customers. By linking customer records across disconnected systems, Informatica enables a complete Customer 360 view. It also extends that capability across domains, managing data for products, suppliers, and reference entities with the same clarity and control.

With powerful data cataloging, lineage tracking, and quality tools, Informatica ensures every record is accurate, traceable, and ready for AI-powered workflows. The result: better insights, faster decisions, and greater trust in every interaction powered by Data Cloud.

Informatica: Driving the Future of Data with Confidence

The addition of Informatica to Salesforce is a meaningful step forward. It gives organizations the tools to move faster, clean up complex data, and get a clearer view of customers and operations across every part of the business: sales, service, products, suppliers, and beyond.

With over a decade of hands-on experience with Informatica, we’ve delivered transformative data solutions for leading organizations across industries.  We’ve unified scattered records into precise customer profiles, built enterprise-wide governance strategies, and executed complex, cross-domain MDM implementations.

Now, as Informatica becomes a core part of the Salesforce ecosystem, we’re ready to help our customers hit the ground running. We’re not just prepared, we’re already in motion, accelerating adoption and unlocking the full potential of trusted, intelligent data.

Have questions or want to explore what this means for your organization? Contact us — we’re here to help you take the next step.

Informatica Intelligent Cloud Services (IICS) Cloud Data Integration (CDI) for PowerCenter Experts

Informatica Power Center professionals transitioning to Informatica Intelligent Cloud Services (IICS) Cloud Data Integration (CDI) will find both exciting opportunities and new challenges. While core data integration principles remain, IICS’s cloud-native architecture requires a shift in mindset. This article outlines key differences, migration strategies, and best practices for a smooth transition.

Core Differences Between Power Center and IICS CDI:

  • Architecture: Power Center is on-premise, while IICS CDI is a cloud-based iPaaS. Key architectural distinctions include:
    • Agent-Based Processing: IICS uses Secure Agents as a bridge between on-premise and cloud sources.
    • Cloud-Native Infrastructure: IICS leverages cloud elasticity for scalability, unlike Power Center’s server-based approach.
    • Microservices: IICS offers modular, independently scalable services.
  • Development and UI: IICS uses a web-based UI, replacing Power Center’s thick client (Repository Manager, Designer, Workflow Manager, Monitor). IICS organizes objects into projects and folders (not repositories) and uses tasks, taskflows, and mappings (not workflows) for process execution.
  • Connectivity and Deployment: IICS offers native cloud connectivity to services like AWS, Azure, and Google Cloud. It supports hybrid deployments and enhanced parameterization.

Migration Strategies:

  1. Assessment: Thoroughly review existing Power Center workflows, mappings, and transformations to understand dependencies and complexity.
  2. Automated Tools: Leverage Informatica’s migration tools, such as the Power Center to IICS Migration Utility, to convert mappings.
  3. Optimization: Rebuild or optimize mappings as needed, taking advantage of IICS capabilities.

Best Practices for IICS CDI:

  1. Secure Agent Efficiency: Deploy Secure Agents near data sources for optimal performance and reduced latency.
  2. Reusable Components: Utilize reusable mappings and templates for standardization.
  3. Performance Monitoring: Use Operational Insights to track execution, identify bottlenecks, and optimize pipelines.
  4. Security: Implement robust security measures, including role-based access, encryption, and data masking.

Conclusion:

IICS CDI offers Power Center users a modern, scalable, and efficient cloud-based data integration platform. While adapting to the new UI and development paradigm requires learning, the fundamental data integration principles remain. By understanding the architectural differences, using migration tools, and following best practices, Power Center professionals can successfully transition to IICS CDI and harness the power of cloud-based data integration.

Understanding In-Out and Input Parameters in IICS

In Informatica Intelligent Cloud Services (IICS), In-Out and Input Parameters provide flexibility in managing dynamic values for your mappings. This allows you to avoid hard-coding values directly into the mapping and instead configure them externally through parameter files, ensuring ease of maintenance, especially in production environments. Below, we’ll walk through the concepts and how to use these parameters effectively in your IICS mappings.

In-Out Parameters

  1. Similar to Mapping Variables in Informatica PowerCenter: In-Out parameters in IICS function similarly to mapping parameters or variables in Informatica PowerCenter. These parameters allow you to define values that can be used across the entire mapping and changed externally without altering the mapping itself.
  2. Frequently Updating Values: In scenarios where a field value needs to be updated multiple times, such as a Product Discount that changes yearly, quarterly, or daily, In-Out parameters can save time and reduce errors. Instead of hard-coding the discount value in the mapping, you can define an In-Out parameter and store the value in a parameter file.
  3. For Example – Product Discount: If the Product Discount changes yearly, quarterly, or daily, you can create an In-Out parameter in your IICS mapping to store the discount value. Instead of updating the mapping each time the discount value changes, you only need to update the value in the parameter file.
  4. Changing Parameter Values: Whenever the discount value needs to be updated, simply change it in the parameter file. This eliminates the need to modify and redeploy the mapping itself, saving time and effort.
  5. Creating an In-Out Parameter: You can create an In-Out parameter in the mapping by specifying the parameter name, then supply its value in the parameter file.
  6. Configuring the Parameter File Path: In the Mapping Configuration Task (MCT), you can download the parameter file template. Provide the path and filename of the parameter file, and you can see the In-Out parameter definition in the MCT.
  7. Download the Parameter File Template: You can download the parameter file template directly from the MCT by clicking on "Download Parameter File Template." After downloading, place the file in the specified directory.
  8. Defining Parameter Values: In the parameter file, define the values for your parameters. For example, if you're setting a Discount value, your file could look like this:
     #USE_SECTIONS
     [INFORMATICA].[INOUT_PARAM].[m_test]
     $$Product_Discount=10
     [Global]
  9. Creating Multiple Parameters: You can create as many parameters as needed, using standard data types in the In-Out Parameters section. Common real-world parameters might include values like Product Category, Model, etc. (a brief usage sketch follows this list).
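Once defined, an In-Out parameter is referenced in mapping expressions with the double-dollar syntax. The following is only an illustrative sketch; the port names PRICE and DISCOUNTED_PRICE are hypothetical, while $$Product_Discount is the parameter from the example above:

DISCOUNTED_PRICE = PRICE * (1 - $$Product_Discount / 100)

At run time, CDI substitutes the value from the parameter file (10 in the example above), so the expression resolves without any change to the mapping.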

Input Parameters:

Input parameters are primarily used for parameterizing Source and Target Connections or objects. Here’s how to use input parameters effectively:

  1. Create the Mapping First: Start by designing your mapping logic, completing field mappings, and validating the mapping. Once the mapping is ready, configure the input parameters.
  2. Parameterizing Source and Target Connections: When parameterizing connections, create parameters for the source and target connections in the mapping. This ensures flexibility, especially when you need to change connection details without modifying the mapping itself. To create the Input parameter, go to the Parameter panel, click on Input Parameter, and create the Source and Target Parameter connections. Select the type as Connection, and choose the appropriate connection type (e.g., Oracle, SQL Server, Salesforce) from the drop-down menu.
  3. Overriding Parameters at Runtime: If you select the "Allow Parameters to be Overridden at Runtime" option, IICS will use the values defined in the parameter file, overriding any hard-coded values in the mapping. This ensures that the runtime environment is always in sync with the latest configuration.
  4. Configuring Source and Target Connection Parameters: Specify the values for your source and target connection parameters in the parameter file, which will be used during runtime to establish connections.
     For example:
     #USE_SECTIONS
     [INFORMATICA].[INOUT_PARAM].[m_test]
     $$Product_Discount=10
     $$SRC_Connection=
     $$TGT_Connection=
     [Global]

Conclusion

In-Out and Input Parameters in IICS offer a powerful way to create flexible, reusable, and easily configurable mappings. By parameterizing values like field values, Source and Target Connections, or Objects, you can maintain and update your mappings efficiently.

A Comprehensive Guide to IDMC Metadata Extraction in Table Format

Metadata Extraction: IDMC vs. PowerCenter

When we talk about metadata extraction, IDMC (Intelligent Data Management Cloud) can be trickier than PowerCenter. Let’s see why.
In PowerCenter, all metadata is stored in a local database. This setup lets us use SQL queries to get data quickly and easily. It’s simple and efficient.
In contrast, IDMC relies on the IICS Cloud Repository for metadata storage. This means we have to use APIs to get the data we need. While this method works well, it can be more complicated. The data comes back in JSON format. JSON is flexible, but it can be hard to read at first glance.
To make it easier to understand, we convert the JSON data into a table format. We use a tool called jq to help with this. jq allows us to change JSON data into CSV or table formats. This makes the data clearer and easier to analyze.

In this section, we will explore jq. jq is a command-line tool that helps you work with JSON data easily. It lets you parse, filter, and change JSON in a simple and clear way. With jq, you can quickly access specific parts of a JSON file, making it easier to work with large datasets. This tool is particularly useful for developers and data analysts who need to process JSON data from APIs or other sources, as it simplifies complex data structures into manageable formats.
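As a quick illustration of what jq does (the taskflow name and timestamps below are made-up sample values, not output from a real IICS org, though the field names match those used later in this post), the following one-liner turns a small JSON array into CSV rows:

echo '[{"assetName":"tf_orders","startTime":"2025-06-01T10:00:00Z","endTime":"2025-06-01T10:05:00Z"}]' | jq -r '.[] | [.assetName, .startTime, .endTime] | @csv'

which prints:

"tf_orders","2025-06-01T10:00:00Z","2025-06-01T10:05:00Z"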

For instance, if the requirement is to gather Succeeded Taskflow details, this involves two main processes. First, you’ll run the IICS APIs to gather the necessary data. Once you have that data, the next step is to execute a jq query to pull out the specific results. Let’s explore two methods in detail.

Extracting Metadata via Postman and jq:-

Step 1:
To begin, utilize the IICS APIs to extract the necessary data from the cloud repository. After successfully retrieving the data, ensure that you save the file in JSON format, which is ideal for structured data representation.

Step 2:
Construct a jq query to extract the specific details from the JSON file. This will allow you to filter and manipulate the data effectively.

Windows:-
(echo Taskflow_Name,Start_Time,End_Time & jq -r ".[] | [.assetName, .startTime, .endTime] | @csv" C:\Users\christon.rameshjason\Documents\Reference_Documents\POC.json) > C:\Users\christon.rameshjason\Documents\Reference_Documents\Final_results.csv

Linux:-
jq -r '["Taskflow_Name","Start_Time","End_Time"],(.[] | [.assetName, .startTime, .endTime]) | @csv' /opt/informatica/test/POC.json > /opt/informatica/test/Final_results.csv

Step 3:
To proceed, run the jq query in the Command Prompt or Terminal. Upon successful execution, the results will be saved in CSV file format, providing a structured way to analyze the data.


Extracting Metadata via Command Prompt and jq:-

Step 1:
Formulate a cURL command that utilizes IICS APIs to access metadata from the IICS Cloud repository. This command will allow you to access essential information stored in the cloud.

Windows and Linux:-
curl -s -L -X GET -u USER_NAME:PASSWORD "https://<BASE_URL>/active-bpel/services/tf/status?runStatus=Success" -H "Accept: application/json"

Step 2:
Develop a jq query along with cURL to extract the required details from the JSON file. This query will help you isolate the specific data points necessary for your project.

Windows:
(curl -s -L -X GET -u USER_NAME:PASSWORD "https://<BASE_URL>/active-bpel/services/tf/status?runStatus=Success" -H "Accept: application/json") | (echo Taskflow_Name,Start_Time,End_Time & jq -r ".[] | [.assetName, .startTime, .endTime] | @csv") > C:\Users\christon.rameshjason\Documents\Reference_Documents\Final_results.csv

Linux:
curl -s -L -X GET -u USER_NAME:PASSWORD "https://<BASE_URL>/active-bpel/services/tf/status?runStatus=Success" -H "Accept: application/json" | jq -r '["Taskflow_Name","Start_Time","End_Time"],(.[] | [.assetName, .startTime, .endTime]) | @csv' > /opt/informatica/test/Final_results.csv

Step 3:
Launch the Command Prompt and run the cURL command that includes the jq query. Upon running the query, the results will be saved in CSV format, which is widely used for data handling and can be easily imported into various applications for analysis.


Conclusion
To wrap up, the methods outlined for extracting workflow metadata from IDMC are designed to streamline your workflow, minimizing manual tasks and maximizing productivity. By automating these processes, you can dedicate more energy to strategic analysis rather than tedious data collection. If you need further details about IDMC APIs or jq queries, feel free to drop a comment below!

Reference Links:-

IICS Data Integration REST API – Monitoring taskflow status with the status resource API

jq Download Link – Jq_Download

A Step-by-Step Guide to Extracting Workflow Details for PC-IDMC Migration Without a PC Database

In the PC-IDMC conversion process, it can be challenging to gather detailed information about workflows. Specifically, we often need to determine:

  • The number of transformations used in each mapping.
  • The number of sessions utilized within the workflow.
  • Whether any parameters or variables are being employed in the mappings.
  • The count of reusable versus non-reusable sessions used in the workflow etc.

To obtain these details, we currently have to open each workflow individually, which is time-consuming. Alternatively, we could use complex queries to extract this information from the PowerCenter metadata in the database tables.

This section focuses on XQuery, a versatile language designed for querying and extracting information from XML files. When workflows are exported from the PowerCenter repository or Workflow Manager, the data is generated in XML format. By employing XQuery, we can effectively retrieve the specific details and data associated with the workflow from this XML file.

Step-by-Step Guide to Extracting Workflow Details Using XQuery: –

For instance, if the requirement is to retrieve all reusable and non-reusable sessions for a particular workflow or a set of workflows, we can utilize XQuery to extract this data efficiently.

Step 1:
Begin by exporting the workflows from either the PowerCenter Repository Manager or the Workflow Manager. You have the option to export multiple workflows together as one XML file, or you can export a single workflow and save it as an individual XML file.


Step 2:-
Develop the XQuery based on our specific requirements. In this case, we need to fetch all the reusable and non-reusable sessions from the workflows.

let $header := "Folder_Name,Workflow_Name,Session_Name,Mapping_Name"
let $dt := (let $data := 
    ((for $f in POWERMART/REPOSITORY/FOLDER
    let $fn:= data($f/@NAME)
    return
        for $w in $f/WORKFLOW
        let $wn:= data($w/@NAME)
        return
            for $s in $w/SESSION
            let $sn:= data($s/@NAME)
            let $mn:= data($s/@MAPPINGNAME)
            return
                <Names>
                    {
                        $fn ,
                        "," ,
                        $wn ,
                        "," ,
                        $sn ,
                        "," ,
                        $mn
                    }
                </Names>)
    |           
    (for $f in POWERMART/REPOSITORY/FOLDER
    let $fn:= data($f/@NAME)
    return          
        for $s in $f/SESSION
        let $sn:= data($s/@NAME)
        let $mn:= data($s/@MAPPINGNAME)
        return
            for $w in $f/WORKFLOW
            let $wn:= data($w/@NAME)
            let $wtn:= data($w/TASKINSTANCE/@TASKNAME)
            where $sn = $wtn
            return
                <Names>
                    {
                        $fn ,
                        "," ,
                        $wn ,
                        "," ,
                        $sn ,
                        "," ,
                        $mn
                    }
                </Names>))
       for $test in $data
          return
            replace($test/text()," ",""))
      return
 string-join(($header,$dt), "
")

Step 3:
Select the necessary third-party tool to execute the XQuery, or opt for an online tool if preferred. For example, you can use BaseX, Altova XMLSpy, and others. In this instance, we are using BaseX, which is an open-source tool.

Create a database in BaseX to run the XQuery.
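If you prefer a command-line run over the GUI, the standalone BaseX distribution can also evaluate a query file directly against an XML input. This is only a sketch with hypothetical file names (workflows.xml, get_sessions.xq), and the option syntax should be confirmed for your BaseX version:

basex -i workflows.xml get_sessions.xq > sessions.csv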

Step 4: Enter the created XQuery into the third-party tool or online tool to run it and retrieve the results.

Step 5:
Export the results in the required file format.

Conclusion:
These simple techniques allow you to extract workflow details effectively, aiding in planning and in the early identification of workflows that will need complex manual conversion. Many queries exist to fetch different kinds of data. If you need more XQueries, just leave a comment below!

Streams with Tasks in Snowflake

Snowflake’s Stream

Stream

A stream is a change data capture (CDC) mechanism in Snowflake; it records the DML changes made to a table (insert, update, delete). When a stream is created on a table, Snowflake adds hidden metadata columns (METADATA$ACTION, METADATA$ISUPDATE, METADATA$ROW_ID) to track the changes.

 

create or replace stream s_emp on table emp append_only=false;

 


I have two tables, emp and emp_hist. Emp is my source table, and emp_hist will be my target.


Now, I will insert a new row in my source table to capture the data in my stream.


Let’s see our stream result.

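To see what the stream captured, query it directly; based on documented stream behavior, a freshly inserted source row shows up with METADATA$ACTION = 'INSERT' and METADATA$ISUPDATE = FALSE alongside the regular emp columns:

select * from s_emp;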

 

In the same way, I’m going to delete and update my source table.


 

I deleted one record and updated another row, but in the stream we can see two DELETE actions.

  1. The first DELETE action is for the row that I deleted, and the second one is for the row that I updated.
  2. If a row is deleted from the source, the stream captures METADATA$ACTION as DELETE and METADATA$ISUPDATE as FALSE.
  3. If a row is updated in the source, the stream captures both a delete and an insert action: the old row as DELETE and the updated row as INSERT.

Create a Merge Query to Store the Stream Data into the Final Table

I'm using the merge query below to capture newly inserted and updated records (SCD Type 1) into my final table.

merge into emp_hist t1
using (select * from s_emp where not (METADATA$ACTION = 'DELETE' and METADATA$ISUPDATE = 'TRUE')) t2
on t1.emp_id = t2.emp_id
when matched and t2.METADATA$ACTION = 'DELETE' and METADATA$ISUPDATE = 'FALSE' then delete
when matched and t2.METADATA$ACTION = 'INSERT' and METADATA$ISUPDATE = 'TRUE'
  then update set t1.emp_name = t2.emp_name, t1.location = t2.location
when not matched then
  insert (emp_id, emp_name, location) values (t2.emp_id, t2.emp_name, t2.location);

 


Query for SCD2

BEGIN;

update empl_hist t1
set t1.emp_name = t2.emp_name, t1.location = t2.location, t1.end_date = current_timestamp :: timestamp_ntz
from (select emp_id, emp_name, location from s_empl where METADATA$ACTION = 'DELETE') t2
where t1.emp_id = t2.emp_id;

insert into empl_hist
select t2.emp_id, t2.emp_name, t2.location, current_timestamp, NULL
from s_empl t2 where t2.METADATA$ACTION = 'INSERT';

commit;

 

Tasks

Tasks let you automate and schedule SQL statements or stored procedures as part of a business process. A single task can perform a simple or complex step in your data pipeline.

I have created a task for the merge query above. Instead of running that query manually every time, we can let a task run it. Here, I have added the condition system$stream_has_data('emp_s') to the task definition, so the task runs and loads the target table only when data is available in the stream; otherwise, the run is skipped.

create task mytask
warehouse = compute_wh
schedule = '1 minute'
when system$stream_has_data('emp_s')
as
merge into emp_hist t1
using (select * from emp_s where not (METADATA$ACTION = 'DELETE' and METADATA$ISUPDATE = 'TRUE')) t2
on t1.emp_id = t2.emp_id
when matched and t2.METADATA$ACTION = 'DELETE' and METADATA$ISUPDATE = 'FALSE' then delete
when matched and t2.METADATA$ACTION = 'INSERT' and METADATA$ISUPDATE = 'TRUE'
  then update set t1.emp_name = t2.emp_name, t1.location = t2.location
when not matched then
  insert (emp_id, emp_name, location) values (t2.emp_id, t2.emp_name, t2.location);
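One follow-up step worth noting (not shown above): a newly created task is suspended by default, so it must be resumed before the schedule takes effect, and recent runs can be checked from the task history:

alter task mytask resume;

select * from table(information_schema.task_history(task_name => 'MYTASK'));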

 

Data Governance in Banking and Financial Services – Importance, Tools and the Future

Let's talk about data governance in banking and financial services, an area I have loved working in across many of its facets… where data isn't just data, and numbers aren't just numbers… They're sacred artifacts that need to be protected, documented, and, of course, regulated within an inch of their lives. It's not exactly the most glamorous part of financial services, but without solid data governance, banks would be floating in a sea of disorganized, chaotic, and potentially disastrous data mismanagement. And when we're talking about billions of dollars in transactions, we're not playing around.

As Bob Seiner, a renowned data governance expert, puts it, “Data governance is like oxygen. You don’t notice it until it’s missing, and by then, it’s probably too late.” If that doesn’t send a chill down your spine, nothing will.

Why is Data Governance Such a Big Deal?

In the banking sector, data governance is more than just a compliance checkbox. It’s essential for survival. Banks process an astronomical amount of sensitive information daily—think trillions of transactions annually—and they need to manage that data efficiently and securely. According to the World Bank, the global financial industry processes over $5 trillion in transactions every day. That’s not the kind of volume you want slipping through the cracks.

Even a small data breach can cost banks upwards of $4.35 million on average, according to a 2022 IBM report. No one wants to be the bank that has to call its shareholders after that kind of financial disaster.

Data governance helps mitigate these risks by ensuring data is accurate, consistent, and compliant with regulations like GDPR, CCPA, and Basel III. These rules are about as fun as reading tax code, but they’re crucial in ensuring customer data is protected, privacy is maintained, and banks don’t end up with regulators breathing down their necks.

Tools of the Data Governance Trade

Let’s talk about the cavalry—the tools that keep all this data governance stuff from turning into a full-blown nightmare. Thankfully, in 2024, we’re spoiled with a variety of platforms designed specifically to handle this madness.

  • Collibra and Informatica

    • Collibra and Informatica are heavyweights in the data governance world, offering comprehensive suites for data cataloging, stewardship, and governance. Financial services companies like AXA and ABN AMRO rely on these tools to handle everything from compliance workflows to data lineage mapping.
  • Alation and Talend

    • Alation is known for its AI-powered data cataloging and governance capabilities, while Talend excels in data integration and governance. Companies like American Express have adopted Alation’s tools to streamline their data governance operations.

The Future of Data Governance in Banking

Looking forward, the financial sector’s reliance on robust data governance is only going to increase. With the rise of AI, machine learning, and real-time data analytics, banks will need to be even more diligent in how they manage and govern their data. A recent study from IDC suggests that by 2026, 70% of financial institutions will have formalized data governance frameworks in place. That’s up from around 50% today, meaning that the laggards are starting to realize that flying by the seat of their pants just won’t cut it anymore.

Jamie Dimon, CEO of JPMorgan Chase, emphasized the importance of data governance in a recent shareholder letter, stating, “Data is the lifeblood of our organization. Our ability to harness, protect, and leverage it effectively will determine our success in the coming decades.”

Climate risk models are the newest elephant in the room. As banks face pressure to account for environmental factors in their risk assessments, data governance plays a critical role in ensuring the accuracy and transparency of these models. According to S&P Global, nearly 60% of global banks will be embedding climate risk into their core business models by 2025.

In a world where data is king, and compliance is the watchful queen, banks are stuck playing by the rules whether they like it or not. Data governance tools are not just for keeping regulators happy, but they also give financial institutions the confidence to innovate, knowing that they’ve got their data house in order.

A recent survey by Deloitte found that 67% of banking executives believe that improving data governance is critical to their digital transformation efforts. This statistic underscores the growing recognition that effective data governance is not just about compliance, but also about enabling innovation and competitive advantage.

So, yeah… data governance might not be the flashiest part of banking, but it’s the foundation that holds everything together. And if there’s one thing we can agree on, it’s that nobody wants to be the bank that ends up on the evening news because they forgot to lock the vault—whether it’s the physical one or the digital one.

SNOWPIPE WITH AWS

SNOWFLAKE’S SNOWPIPE

Snowpipe:

Snowpipe is one of the data loading strategies in Snowflake, used for continuous data loading. You create a Snowpipe to load data from a data source, storage location, or cloud into Snowflake tables. It is event-driven: whenever a file arrives at the source, a notification is triggered and Snowpipe loads the data into the table almost immediately.

 

Procedure of Snowpipe:

 

S3 bucket setup for Snowpipe:

Create an S3 bucket in AWS and a folder in it:

Step 1: Create an IAM policy

  1. From the home dashboard, search for and select IAM.
  2. From the left-hand navigation pane, select Account settings.
  3. Under Security Token Service (STS) in the Endpoints list, find the Snowflake region where your account is located. If the STS status is inactive, move the toggle to Active.
  4. From the left-hand navigation pane, select Policies.
  5. Select Create Policy.
  6. For Policy editor, select JSON.
  7. Add a policy document that will allow Snowflake to access the S3 bucket and folder.

The following policy (in JSON format) provides Snowflake with the required permissions to load or unload data using a single bucket and folder path.

Copy and paste the text into the policy editor:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion"
      ],
      "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
    }
  ]
}

 

  1. Note that AWS policies support a variety of different security use cases.
  2. Select Next.
  3. Enter a Policy name (for example, snowflake_integration)
  4. Select Create policy.

Step 2: Create the IAM Role in AWS

To configure access permissions for Snowflake in the AWS Management Console, do the following:

  1. From the left-hand navigation pane in the Identity and Access Management (IAM) Dashboard, select Roles.
  2. Select Create role.
  3. Select AWS account as the trusted entity type.
  4. In the Account ID field, enter your own AWS account ID temporarily. Later, you modify the trust relationship and grant access to Snowflake.
  5. Select the Require external ID option. An external ID is used to grant access to your AWS resources (such as S3 buckets) to a third party like Snowflake.

Enter a placeholder ID such as 0000. In a later step, you will modify the trust relationship for your IAM role and specify the external ID for your storage integration.

  1. Select Next.
  2. Select the policy you created in Step 1: Configure Access Permissions for the S3 Bucket(in this topic).
  3. Select Next.
  4. Enter a name and description for the role, then select Create role.

You have now created an IAM policy for a bucket, created an IAM role, and attached the policy to the role.

  1. On the role summary page, locate and record the Role ARN value. In the next step, you will create a Snowflake integration that references this role.

Note

Snowflake caches the temporary credentials for a period that cannot exceed the 60 minute expiration time. If you revoke access from Snowflake, users might be able to list files and access data from the cloud storage location until the cache expires.

Step 3: Create a Cloud Storage Integration in Snowflake

A storage integration is a Snowflake object that stores a generated identity and access management (IAM) user for your S3 cloud storage, along with an optional set of allowed or blocked storage locations (i.e. buckets). Cloud provider administrators in your organization grant permissions on the storage locations to the generated user. This option allows users to avoid supplying credentials when creating stages or loading data.

CREATE or replace STORAGE INTEGRATION bowiya_inte
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = 'S3'
ENABLED = TRUE
STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::151364773749:role/manojrole'
STORAGE_ALLOWED_LOCATIONS = ('s3://newbucket1.10/sample.csv');

 

This creates an integration that allows access to the specified storage location. Additional external stages that also use this integration can reference the allowed buckets and paths:

Step 4: Retrieve the AWS IAM User for your Snowflake Account

  1. To retrieve the ARN for the IAM user that was created automatically for your Snowflake account, use:

desc integration bowiya_inte;

Step 5: Grant the IAM User Permissions to Access Bucket Objects

The following step-by-step instructions describe how to configure IAM access permissions for Snowflake in your AWS Management Console so that you can use an S3 bucket to load and unload data:

  1. Log in to the AWS Management Console.
  2. Select IAM.
  3. From the left-hand navigation pane, select Roles.
  4. Select the role you created
  5. Select the Trust relationships tab.
  6. Select Edit trust policy.
  7. Modify the trust policy document using the STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID values returned by DESC STORAGE INTEGRATION in the previous step.

 

Step 6: CREATE A STAGE IN SNOWFLAKE:

A stage is an object where files can be staged from local or cloud storage; using the stage, we can load the data into tables.

 

CREATE or replace STAGE mystage
URL = 's3://newbucket1.10/sample.csv'
STORAGE_INTEGRATION = bowiya_inte;

 

 

 

Step 7: CREATE A SNOWPIPE IN SNOWFLAKE:

 

CREATE or replace PIPE mypipe

AUTO_INGEST = TRUE

AS

COPY INTO table1

FROM @mystage

FILE_FORMAT = (type = ‘CSV’ SKIP_HEADER = 1);

 

Step 8: CREATE AN EVENT NOTIFICATION IN S3:

An event notification is triggered whenever an object is added to or changed in the bucket.

 

In S3, go to Properties and create a notification event.

STEP 9: Get the SQS queue ARN from your Snowflake pipe.
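The post captures this from the UI; in SQL it corresponds to reading the notification_channel column returned for the pipe, for example:

show pipes;

desc pipe mypipe;

The notification_channel value is the ARN of the Snowflake-managed SQS queue, and that ARN is what the S3 event notification should send to.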


 

Once the notification event is created, Snowpipe will load the data whenever a file is added or changed in the S3 bucket.

 

STEP 10: MONITOR THE SNOWPIPE STATUS.
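The status check itself is a SQL call, and recent pipe loads can also be reviewed from the copy history; the table name TABLE1 below comes from the pipe definition above:

select system$pipe_status('mypipe');

select * from table(information_schema.copy_history(table_name => 'TABLE1', start_time => dateadd(hour, -1, current_timestamp())));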


 

NOTE: Snowpipe won't load the same file again. Snowpipe keeps load metadata for files it has already ingested (keyed by the file name), so if the same file is uploaded again it is skipped rather than loaded a second time.

Buckle up, we're headed to Informatica World 2024!

On the Road to Vegas!

We have our passes and are ready to hit the road for Informatica World 2024! This year’s conference is hosted in the heart of Las Vegas at Mandalay Bay Resort and Casino, May 20-23. Informatica World is an annual event that brings together over 2,000 experts to network, collaborate, and strategize new use cases on the Informatica platform.

Informatica empowers customers to maximize their data and AI capabilities. When leveraged properly, the Informatica Intelligent Data Management Cloud (IDMC) will uncover a clear path to success.

Thoughts from Leadership:

“It is always a pleasure to attend a sales conference, but when it comes to Informatica World, the energy is contagious, and the insights gained are nothing short of transformative.”

Informatica Practice Director, Atul Mangla

“I am ecstatic to be a part of the group representing Perficient at Informatica World 2024 and am looking forward to connecting with new folks and delving deep into the ever-expanding capabilities Informatica has to offer.”

Portfolio Specialist, Kendall Reahm


The Attendees:

Our team is looking forward to immersing itself in a full schedule of keynotes, summits, and breakouts to further hone our expertise with the Informatica platform. Perficient will be represented by:

  1. Santhosh Nair – Data Solutions GM/ AVP
  2. Atul Mangla – Informatica Practice Director
  3. Kendall Reahm – Portfolio Specialist
  4. Scott Vines – Portfolio Specialist

The Perficient thought leaders would love to meet you during the event! Reach out and let us know if you are coming so we can collaborate on innovative solutions for you and your customers leveraging the power of Perficient and Informatica.

See you there!

 

As an award-winning Platinum Enterprise Partner, our team helps businesses rapidly scale enterprise services and collaboration tools to create value for employees and customers. We offer a wide range of solutions tailored to the unique needs of each customer.

Learn more about the Perficient and Informatica practice here.

IICS Micro and Macro Services

 

Macros in IICS

 

Informatica IICS: An expression macro is a useful technique for creating complex or repeating expressions in mappings. It makes it possible to perform computations over various fields or constants by creating a collection of related expressions, so that the same computation can be applied to several input fields.

Steps to Use Macros:

  1. Log in to your Informatica Cloud account and open the Data Integration service.
  2. Create a new mapping: click New in the navigation window, select Mapping, and click Create.
  3. Select the source and target objects in the Source and Target transformations.
  4. Create an Expression transformation in the IICS mapping between the Source and Target.
  5. Click the "+" icon in the Expression transformation to create an input macro field. Then choose "Input_Macro_Field" as the field type.
  6. After generating the input macro, configure the port according to the requirements, that is, whether the same logic or condition should apply to all fields or only a few specific fields.
  7. Create an additional field in the same way as before, but this time choose "Output_Macro_Field" as the field type for the output macro, choose the data type, and set the precision to "Max" to avoid data truncation.
  8. Configure your macro expression in the output macro.
  9. For example, we had to apply the LTRIM and RTRIM functions and set all blank values to null (a sketch of such an expression follows this list). However, attempting to validate this expression results in the error "This expression cannot be validated because it uses macro input fields", so avoid clicking the Validate button.
  10. Navigating to the Target transformation, you will see an additional incoming field from the Expression transformation, "%Input_Macro%_out".
  11. In the Target transformation, choose "Completely Parameterized" under Field Mapping, and then create a new parameter.
  12. Save the mapping.
  13. Create a Mapping Configuration Task (MCT), select the Runtime Environment, and click Next.
  14. Map all the fields with the suffix "_out" so that the expression macro logic is applied.
  15. Click Finish and run the MCT to complete the mapping requirements.
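The exact expression isn't shown in the post, so this is only a sketch of the kind of output-macro expression described in step 9, using the input macro field %Input_Macro% from the steps above and the standard IIF, LTRIM, and RTRIM functions:

IIF(LTRIM(RTRIM(%Input_Macro%)) = '', NULL, LTRIM(RTRIM(%Input_Macro%)))

At run time the macro expands this expression once for each configured input field, producing the corresponding %Input_Macro%_out output fields referenced in step 10.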