Data & Intelligence

SCD Type 1 using tAddCRCRow component

Leverage Data Analytics In Software Development To Grow Business Value

When we approach a scenario related to database, commonly used SCD types. Even though Talend has an inbuilt component we mostly prefer not to use tDBSCD feature to improve performance. The common way to implement SCD is to normalize a job design instead of using a single component. Irrespective of technology people look for an effortless approach or straightforward methodology to implement any model/Job. Let’s implement the painless design in a minutes.

Introducing a component tAddCRCRow to achieve hashkey which helps to track the data to identify the changes. This component has an inbuilt column (CRC) which will create a hashkey based on the selected non-key columns.

B1

Let’s take an example to understand better, consider the banking domain source as csv file and target as table. Here we are going to handle upsert by splitting the data flow based on target lookup.

Job Design

Data Intelligence - The Future of Big Data
The Future of Big Data

With some guidance, you can craft a data platform that is right for your organization’s needs and gets the most return from your data capital.

Get the Guide

B2

We can place the CRC component before comparing source and target data to generate the hashkey. In this component ,enable the checkbox (Use in CRC) of non key columns that creates the hash value in CRC column for every single record coming from source.

B3

And then we are segregating the data using inner join by matching the key columns. One flow is to capture the rejects by enabling “catch lookup inner join reject” as true and insert into target table.

B4

Another flow is to update the matched records and filter the changed records by comparing the Key column and CRC column. This filter is to avoid the unwanted update of non-changed records in this flow. Changes are  updated in target table with new hash value.

B5

Point to remember :- Need to define hashkey and audit columns in target table.

B6This approach will be more effective when we compare more number of non-key columns.

About the Author

Rajeshwari Ramamoorthy is a senior technical consultant at Perficient. She is an expert in the Talend ETL tool.

More from this Author

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.