As Oracle BI Cloud Service (BICS) starts to become popular, I am going to write a series of blog posts about the best use of Data Sync to upload data to the BICS schema service. In this post, I am going to share with you a best practice for using Data Sync for ETL.
As an ETL tool, Data Sync has its strengths:
- Very simple to install, configure, and run data mappings
- Written in Java, runs on any platform that has a Java VM
- Able to connect to any database that has a JDBC driver
- Built-in connection support for the most commonly used databases
But Data Sync also has its limitations:
- No advanced data transformation capabilities like those found in Informatica and ODI
- Custom SQL queries (SQL overrides) don’t work with flat files or with data spread across multiple data sources (see the example after this list)
- Only SQL Server and Oracle databases are supported as write targets
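For context, a custom SQL query (SQL override) in Data Sync simply replaces the straight table read with a SELECT statement that runs against a single source connection. Here is a minimal hypothetical example (the table and column names are made up); because the query executes inside one source database, it cannot join to flat files or to tables in other connections:

```sql
-- Hypothetical SQL override: Data Sync runs this query against one
-- Oracle source instead of reading SRC_ORDERS as-is. It can reference
-- only objects visible to this single connection.
SELECT ORDER_ID,
       ORDER_DATE,
       ROUND(AMOUNT * EXCHANGE_RATE, 2) AS AMOUNT_USD
FROM   SRC_ORDERS
WHERE  STATUS = 'CLOSED'
```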
If the source data is well defined and structured, Data Sync works perfectly to extract it and upload it to the BICS schema service. But when the data requires transformation, such as a lookup across multiple data sources, Data Sync is of little help. The transformation could be achieved by writing PL/SQL in the BICS Apex schema service, but it is not a best practice to have the transformation logic scattered, part in Data Sync as custom SQL queries and part in Apex as PL/SQL.
A best practice is to introduce a staging database: a common central area where Data Sync first loads all source data for the necessary transformation, then loads the transformed data to the BICS schema service. This approach remedies all of Data Sync’s limitations without losing its strengths, as long as the staging database is either SQL Server or Oracle.
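For example, suppose the staging database is Oracle, and orders extracted from one source need customer attributes looked up from a second source. Once Data Sync has landed both data sets in the same staging schema, the cross-source lookup that Data Sync cannot do on its own becomes a plain SQL join. A minimal sketch, with hypothetical table and column names:

```sql
-- STG_ORDERS was loaded from one source, STG_CUSTOMERS from another.
-- Build the transformed table that Data Sync will then upload to BICS.
CREATE TABLE STG_ORDERS_XFORM AS
SELECT o.ORDER_ID,
       o.ORDER_DATE,
       o.AMOUNT,
       c.CUSTOMER_NAME,   -- looked up from the second source
       c.REGION
FROM   STG_ORDERS o
LEFT JOIN STG_CUSTOMERS c
       ON c.CUSTOMER_ID = o.CUSTOMER_ID;
```

Data Sync can then treat STG_ORDERS_XFORM as an ordinary source table and load it to the BICS schema service.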
Continue with Configure Data Sync (BICS) to Load Staging Database to learn how to set up Data Sync to load the staging database.
In my next blog post, I will talk about “3 Ways to Integrate Oracle ERP Cloud (OTBI) with Data Sync (BICS)”.
One of the warnings I used to see repeatedly in the Data Sync documentation was that the tool stores sensitive information, including connection details for your on-premises database. The recommendation was to make sure it is installed in production only in a highly controlled environment.
Questions:
1. Does it still contain connection info?
2. What other sensitive information does the tool contain?
3. If #1 and #2 are true, has such sensitive information in the tool affected its acceptance?
Although Data Sync encrypts and stores the sensitive connection info in a file-based credential store (cwallet.sso, the same type of credential store used by WebLogic), it is always a best practice to lock down access to the Data Sync installation, regardless of whether the connected database is on-premises or DBaaS.
Thanks Daniel. I do have a related question.
Why would you use one or the other if the choices were Oracle Data Integrator and Data Sync?
Can you use ODI instead of Data Sync to load data from on-premises to the BICS schema as a service via REST APIs?
Looks like we can load into database as a service using ODI. See here:
http://www.ateam-oracle.com/integrating-oracle-data-integrator-odi-on-premise-with-cloud-services/
Data Sync is provided as a lightweight data tool, extremely easy to install and use, with almost zero maintenance, but it is designed to do one thing specifically well: move data to the Oracle BICS schema service and DBaaS. As a result, Data Sync supports only a limited number of source and target databases, and its framework is not easily extendable.
ODI is a full-blown ELT tool, with all the bells and whistles you would expect from an off-the-shelf ETL tool. New data transformations can easily be written and run as inline SQL scripts or Java code. Its Knowledge Module (KM) framework is extremely extensible: custom KMs can always be built to support new types of data sources, such as social media, which makes it future-proof. It is designed to work with any system and to perform data integration in more complex environments.
A corporation already using ODI for its data integration work should leverage the same ODI skills for its BICS data integration. Otherwise, Data Sync is still the best way to go due to its simplicity.
Thanks again Daniel.
So in Data Sync we can do the following, please confirm:
1. Do incremental loads, meaning only records that have changed in the source since the last load are put into staging.
2. Then, when moving from staging to target, if the record exists in the target, update it; otherwise insert a new record. The primary key can be used to determine whether the record exists. The target in this case will be a database in the cloud; the staging database may be on-premises. (See the sketch after this list.)
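Yes to both. Conceptually, the two steps look like the sketch below, written as plain Oracle SQL with hypothetical table and column names. In Data Sync itself you would typically configure this declaratively, with an incremental filter on a last-updated column and an update/insert load strategy keyed on the primary key, rather than hand-writing the statements:

```sql
-- Step 1: incremental extract into staging. Only rows changed since
-- the last successful load are pulled; :LAST_REFRESH_DATE stands for
-- the bookmark kept from the previous run.
INSERT INTO STG_ORDERS
SELECT *
FROM   SRC_ORDERS
WHERE  LAST_UPD > :LAST_REFRESH_DATE;

-- Step 2: upsert from staging into the target. MERGE uses the primary
-- key to decide between update and insert, matching the behavior
-- described in #2 above.
MERGE INTO TGT_ORDERS t
USING STG_ORDERS s
ON (t.ORDER_ID = s.ORDER_ID)
WHEN MATCHED THEN
  UPDATE SET t.ORDER_DATE = s.ORDER_DATE,
             t.AMOUNT     = s.AMOUNT
WHEN NOT MATCHED THEN
  INSERT (ORDER_ID, ORDER_DATE, AMOUNT)
  VALUES (s.ORDER_ID, s.ORDER_DATE, s.AMOUNT);
```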