Posts Tagged ‘ETL’
A Quick Intro to Azure Data Factory & Its Key Features
This blog will help you understand the basic functionality of Azure Data Factory (ADF) and how powerful a tool it is when working with big data. Explore the basic architecture of ADF and get to know the components and services involved. ADF is […]
Expand your search using AWS native services to identify, comprehend and securely store documents.
The document debacle Companies continue to fight the battle of an age-old problem: paper documents. Adapting to document modernization to expand the ability to search, catalog, and protect HIPAA/PII data is paramount. Perficient continues to help businesses speed up the time it takes to digitize documents for further integration in other sectors. In this article, […]
Data Architecture: 2.5 Types of Modern Data Integration Tools
As we move into the modern cloud data architecture era, enterprises are deploying two primary classes of data integration tools to handle the traditional ETL and ELT use cases. The first type is the GUI-based data integration solution; Talend, InfoSphere DataStage, Informatica, and Matillion are good examples. These tools leverage a UI […]
Join Us at MicroStrategy World 2020
MicroStrategy World 2020 is about a month away, happening February 4-6 in Orlando, FL. Sunny, warm weather and the latest MicroStrategy releases will make for an exciting, education-packed week. Perficient is proud to be a Silver sponsor of the event this year. Our experts look forward to meeting you in the expo hall […]
Migrating AEM Content with Groovy
Migrating content into AEM is nobody’s idea of fun. Creating experiences and authoring content in the powerful AEM authoring experience is great, but identifying, classifying, and mapping legacy content? Not so much. AEM’s repository structure contributes to this challenge. AEM, being based on the Java Content Repository (JCR), offers a massively more flexible content taxonomy […]
Integrate Your Data using Oracle Data Integration Platform Cloud
Oracle Data Integration Platform Cloud (DIPC) is a unified, powerful, data-driven data integration platform in the cloud that can accept data in any format from any source system, whether on premises or in the cloud, and process that data according to the organization’s needs. With DIPC, you get all the capabilities of the most popular E-LT (Extract – Load […]
Oracle BI Data Sync: How to Add a New Fact
Following my previous blog post on how to add a new Dimension to a Data Sync task, this post looks at how to add a Fact and perform a lookup on dimensions while loading the target fact table in a data warehouse using Data Sync. To refer to the blog post on adding a Dimension […]
Oracle BI Data Sync: How to Add a New Dimension
In this and the following post, I will cover the steps entailed in adding dimension and fact tasks in Oracle Data Sync. The latest releases of Data Sync include a few important features, such as performing lookups during an ETL job, so I intend to cover these best practices when adding new dimension and fact […]
Introduction to Data Masking Transformation in Informatica
Introduction On a daily basis, data is growing at a pace greater than the expansion of the universe itself. It makes our lives better, but it can also expose the vulnerabilities of a person or an organization. Data is like the Infinity Gauntlet. If you know how to use it, like Thanos […]
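Informatica’s Data Masking transformation has its own configurable rules and UI; as a rough illustration of two techniques such a transformation typically offers (masking out characters versus deterministic substitution), here is a minimal Python sketch. The function names, salt, and sample row are illustrative, not Informatica APIs.

```python
import hashlib

def mask_value(value: str, visible_suffix: int = 4) -> str:
    """Replace all but the last few characters with '*' (format-preserving-ish masking)."""
    if len(value) <= visible_suffix:
        return "*" * len(value)
    return "*" * (len(value) - visible_suffix) + value[-visible_suffix:]

def pseudonymize(value: str, salt: str = "demo-salt") -> str:
    """Deterministic substitution: the same input always maps to the same token,
    so joins across tables still work, but the original value is hidden."""
    digest = hashlib.sha256((salt + value).encode()).hexdigest()
    return digest[:12]

row = {"name": "Jane Doe", "ssn": "123-45-6789"}
masked = {"name": pseudonymize(row["name"]), "ssn": mask_value(row["ssn"])}
print(masked["ssn"])  # *******6789
```

Deterministic substitution keeps referential integrity in test data, while character masking keeps a recognizable format for display.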
Managing Huge Data Loads Using Bulk Load in Informatica
Everything is data and everyone is data! Everybody says we live in the technology age, the age of the internet, the age of space and the cosmos, the age of the digital world, etc. But all of these developments and advancements have been made possible by the Holy Grail called data. We learned how to store […]
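Informatica’s bulk mode works at the database-utility level, bypassing row-by-row logging for large loads. The core batching idea can be sketched with plain Python and SQLite (the table name and row count are illustrative, not Informatica specifics):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")

# Bulk load: hand the driver the whole batch in one call instead of
# issuing one INSERT statement per row, cutting per-statement overhead.
rows = [(i, i * 1.5) for i in range(10_000)]
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(count)  # 10000
```

The trade-off is the same one Informatica documents for bulk mode: higher throughput in exchange for coarser recovery granularity if the load fails mid-batch.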
Deploying ETL Platforms with Jenkins and AWS CloudFormation at a Large Financial Institution
My focus at Perficient lately is a fast-moving project with 15+ individuals, complex requirements, and a tight deadline. The experience is extremely valuable. The project is an ETL platform on AWS that uses Lambda for event-driven processing, Elastic MapReduce (EMR) for managed Hadoop clusters, RDS and S3 for persistence, and a handful of other services. […]
Spark as ETL
Introduction: In general, the ETL (Extraction, Transformation, and Loading) process is implemented through ETL tools such as DataStage, Informatica, Ab Initio, SSIS, and Talend to load data into the data warehouse. The same process can also be accomplished programmatically, for example with Apache Spark, to load the data into the database. Let’s see how it […]
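The three ETL stages the excerpt names can be sketched in plain Python with SQLite (not actual Spark code; the comments note the rough Spark equivalents, and the sample CSV and table name are made up for illustration):

```python
import csv
import io
import sqlite3

# Extract: read raw records. Here from an in-memory CSV string;
# in Spark this stage would be something like spark.read.csv(...).
raw = "id,amount\n1,10.0\n2,\n3,30.5\n"
records = list(csv.DictReader(io.StringIO(raw)))

# Transform: drop rows with missing amounts and cast types.
# In Spark, this maps to DataFrame filter/withColumn operations.
cleaned = [(int(r["id"]), float(r["amount"])) for r in records if r["amount"]]

# Load: write to the target store. In Spark, df.write.jdbc(...) or similar.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", cleaned)
conn.commit()

total = conn.execute("SELECT SUM(amount) FROM fact_sales").fetchone()[0]
print(total)  # 40.5
```

The point of the Spark version is that the transform stage runs distributed across a cluster; the shape of the pipeline stays the same.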