Bringing Informatica Intelligent Cloud Services into your Release Management Pipeline

Informatica Intelligent Cloud Services (IICS) now offers a free command line utility that can be used to integrate your ETL jobs into most enterprise release management pipelines. It’s called the Asset Management command line interface (CLI). Version two now allows you to extract an IICS job into a single compressed file. Moving a single standalone artifact from Development through QA and Production using a script makes it easy to incorporate your ETL pipeline into common release management tools. This can have profound implications for the deployment stage of a complex enterprise release cycle. Consider a typical war room scenario in a major release and how we can use the new Informatica tools to improve the deployment procedure.

The War Room

In a war room (aka situation room or command center), key stakeholders gather to focus on solving a specific, critical problem. A war room for a major enterprise release is typically focused on the coordinated deployment of artifacts such as application code, storage and compute resources, and database changes. Database changes refer to both structure changes (ex: table definitions) and data migration, which is where Extract, Transform and Load (ETL) tools like Informatica come into play. The divide between the teams that have a CI/CD practice, like the application development and infrastructure teams, and the teams that rely on manual deployments, like the database administrators (DBAs) and ETL teams, becomes painfully clear. The lack of a common frame of reference before the Production deployment is the major source of disconnect. It makes sense to have an integrated release management system in place before deployment.

There are three concepts that make a good release management system: source code control, a pipeline to coordinate deployment of artifacts, and a repository to store versioned instances of these artifacts. Git is the de facto version-control system for tracking changes in source code during software development, and it can also be used to track changes in any set of files. Jenkins is an automation server that executes pipelines related to building, testing and deploying code, usually code stored in a Git repo. Like Git, Jenkins can also be used to deploy any kind of file, even files that are not the subject of building and testing. Finally, a tool like Artifactory can be used to store the binaries that are the target of deployment. This is optional since you can configure your source code tool to store releases. We will not demo it here, but it is a nice tool. There are several approaches that try to insert Informatica at the beginning of the CI/CD process and emulate the same pipeline as a software development project. But Informatica is not a custom Java application, and doesn’t need to be treated like one. Instead, invert the process and start from release management.

Build an Informatica release pipeline

To build an Informatica release pipeline, we are going to start by exporting an Informatica project as a single file. This is the key differentiator between using Informatica as a standalone application and integrating Informatica into your enterprise’s release pipeline. For the sake of this blog, we are going to assume that your organization uses GitHub for source control and Jenkins for orchestration. These tools are popular enough that you will easily be able to find the equivalent instructions for a different application. I deployed this example using Amazon Web Services, but it will work on any public or private cloud.

Set up GitHub

Create a new repository in GitHub to store the sample Informatica project and deploy to Jenkins.

  1. Navigate to GitHub and sign in.
  2. In Your repositories, choose New repository. Alternatively, on the navigation bar, choose Create new (+), and then choose New repository.
  3. In the Create a new repository page, do the following:
    1. In the Repository name box, enter IICSDemo.
    2. Select Public.
    3. Clear the Initialize this repository with a README check box.
    4. Choose Create repository.

Go to AWS (or another cloud provider) and create a free, barebones Linux-based compute instance. Connect to the new instance with a terminal and execute the following:

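# install Git first if the image does not include it (yum-based distributions)
sudo yum install -y git
mkdir -p ~/workspace/default && cd "$_"
touch README.md
git init
git add README.md
git commit -m "Init IICS demo"
git remote add origin https://github.com/user-name/IICSDemo.git
git push -u origin master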

You now have a working GitHub repository. In the next step, we will set up Jenkins.

Set up Jenkins

Clone the EC2 instance you created for the Git workspace (it will also host the IICS CLI later), connect to the new instance in another terminal window, and execute the following:

sudo yum install docker
sudo usermod -a -G docker ec2-user
newgrp docker
sudo systemctl enable docker
sudo systemctl start docker
sudo systemctl status docker

docker run \
-u root \
--rm \
-d \
-p 8080:8080 \
-p 50000:50000 \
--name jenkins_iics \
-v jenkins-data:/var/jenkins_home \
-v /var/run/docker.sock:/var/run/docker.sock \
jenkinsci/blueocean

# get the admin password
docker exec jenkins_iics cat /var/jenkins_home/secrets/initialAdminPassword

Use the admin password to log into Jenkins, which will be running at the public URL of your EC2 instance on port 8080. Create a new user so you don’t log in as the admin all of the time.
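
If the password file is not there yet, Jenkins is probably still starting. The following is a minimal sketch of a polling helper for that situation. The function name retry and the attempt and delay values in the example are illustrative assumptions, not part of Jenkins or Docker.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between tries. Returns 1 if every attempt fails.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    until "$@"; do
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        i=$((i + 1))
        sleep "$delay"
    done
}

# Example (not run here): wait up to about a minute for the password file, then print it.
#   retry 30 2 docker exec jenkins_iics test -f /var/jenkins_home/secrets/initialAdminPassword \
#     && docker exec jenkins_iics cat /var/jenkins_home/secrets/initialAdminPassword
```

The helper is deliberately generic, so the same function can wrap any other command in this walkthrough that may need a few tries.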

From Jenkins

  1. Click New Item
  2. Enter ‘IICS Default’ as the item name
  3. Select Freestyle Project
  4. Click OK
  5. Scroll down to the Build section
    1. Select ‘Execute shell’ from the ‘Add build step’ drop down.
    2. In the command box, type echo "Hello, World"
  6. You can now test the pipeline by clicking Build Now.
    1. Once the build is complete, click on the Console Output to verify the Hello, World message was printed.

Being able to execute a script from Jenkins demonstrates that you can integrate with almost any build pipeline. You could script executing processes, moving files, etc. Next, we will integrate Jenkins with GitHub.
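
As a slightly more useful build step than Hello World, the sketch below fails the build when an expected archive is missing or empty and otherwise prints its checksum. The function name check_artifact and the file name default.zip are assumptions for illustration; WORKSPACE is the environment variable Jenkins sets to the job's working directory.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given file exists and is non-empty.
check_artifact() {
    artifact="$1"
    if [ ! -s "$artifact" ]; then
        echo "ERROR: $artifact is missing or empty" >&2
        return 1
    fi
    # cksum is a POSIX utility; it prints a CRC, the byte count and the file name.
    echo "OK: $(cksum "$artifact")"
}

# In a Jenkins shell build step (not run here):
#   check_artifact "$WORKSPACE/default.zip"
```

A non-zero exit status from the build step is what marks the Jenkins build as failed, so the helper only needs to return 1 when the check does not pass.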

Integrate GitHub and Jenkins

You will be copying and pasting information from Jenkins to GitHub and vice versa, so have both tabs open.

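  1. In Jenkins, open the configuration page of the IICS Default job.
    1. Scroll down to the Source Code Management section.
    2. Select Git.
  2. From your GitHub repo,
    1. Click on the Clone with HTTPS button and paste the URL into the Repository URL text box in Jenkins.
    2. Next, connect Jenkins to GitHub’s push events.
      1. Navigate to Settings and click Webhooks.
      2. Add the Jenkins URL and append /github-webhook/ (ex: http://11.111.111.111:8080/github-webhook/). Keep the trailing slash, and include a prefix such as /jenkins only if your Jenkins is served under that path.
      3. Select application/json as the content type.
      4. Select Just the push event.
      5. Click Add webhook.
  3. Back in Jenkins, select GitHub hook trigger for GITScm polling in the Build Triggers section so that a push starts a build, and click Save.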
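
The payload URL is easy to get subtly wrong, so here is a small sketch that builds it from the Jenkins base URL. The function name webhook_url is a hypothetical helper; pass the base URL exactly as you reach Jenkins in the browser, including the port and any path prefix.

```shell
#!/bin/sh
# Hypothetical helper: append the GitHub webhook path to a Jenkins base URL.
webhook_url() {
    base="${1%/}"                       # drop one trailing slash, if present
    printf '%s/github-webhook/\n' "$base"
}

# Example:
#   webhook_url "http://11.111.111.111:8080"
```

The result always ends in a single trailing slash, whether or not the base URL was typed with one.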

At this point, we have a README file checked into GitHub and Jenkins running a simple Hello World script. In the next step, we will extract the Default project from IICS into the git repo, which will trigger a Jenkins build.

Set up the IICS CLI

Sign up for Informatica Intelligent Cloud Services training and get access to a free instance for thirty days. The first part of your instance’s URL is your pod host name (ex: na1.dm-us.informaticacloud.com). Note your region and make sure you save your user name and password. Execute the following commands from the same terminal session where you configured Git.

mkdir ~/tools && cd "$_"
wget https://github.com/InformaticaCloudApplicationIntegration/Tools/raw/master/IICS%20Asset%20Management%20CLI/v2/linux-x86_64/iics
chmod 775 iics
./iics version

You now have an executable instance of the IICS Asset Management command line interface and it should be version two. Let’s see if you can connect to your IICS instance.

./iics list --region us --podHostName na1.dm-us.informaticacloud.com --username xxxxx.xxxxx@xxxxx.xxx --password xxxxxxx

You should see something like:

Explore/Default.Project
Explore/Default/Mapping1.DTEMPLATE
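
Typing credentials on every call leaves them in your shell history. The sketch below is one way to keep the connection details in environment variables instead. The wrapper name iics_run and the variable names IICS_BIN, IICS_REGION, IICS_POD, IICS_USER and IICS_PASSWORD are assumptions made for this example; the flags are the ones used in this post, written with double dashes.

```shell
#!/bin/sh
# Hypothetical wrapper: run an iics subcommand with connection details taken
# from environment variables. Extra arguments are passed through unchanged.
iics_run() {
    subcommand="$1"
    shift
    # Stop with a message if the credentials are not set.
    : "${IICS_USER:?set IICS_USER}" "${IICS_PASSWORD:?set IICS_PASSWORD}"
    "${IICS_BIN:-$HOME/tools/iics}" "$subcommand" \
        --region "${IICS_REGION:-us}" \
        --podHostName "${IICS_POD:-na1.dm-us.informaticacloud.com}" \
        --username "$IICS_USER" \
        --password "$IICS_PASSWORD" \
        "$@"
}

# Example (not run here):
#   export IICS_USER=xxxxx.xxxxx@xxxxx.xxx IICS_PASSWORD=xxxxxxx
#   iics_run list
```

The password is still handed to the CLI as an argument, so it remains visible in the process list while the command runs. In Jenkins, the same variables could be supplied from stored credentials rather than from a file on disk.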

Now, export a project into a working directory.

cd ~/tools
./iics export --region us --podHostName na1.dm-us.informaticacloud.com --username xxxxx.xxxxx@xxxxx.xxx --password xxxxxxx --zipFilePath ~/workspace/default/default.zip --logLevel info --artifacts Explore/Default.Project

You should now have a zip file called default.zip in the default directory of your workspace. Check it in.

cd ~/workspace/default
git add default.zip 
git commit -m "Push first export from IICS to GitHub" 
git push -u origin master
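
To make this a routine, the export and check-in steps can be combined into one script. This is a sketch under stated assumptions, not a definitive implementation: the function names run and export_and_push, the DRY_RUN switch, and the IICS_USER and IICS_PASSWORD variables are all hypothetical, while the paths, flags and branch name are the ones used above.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
# Note that dry-run output includes the password argument.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# export_and_push ARTIFACT [REPO_DIR]: export one IICS artifact into the
# local clone and push it. Stops at the first step that fails.
export_and_push() {
    artifact="$1"                          # e.g. Explore/Default.Project
    repo="${2:-$HOME/workspace/default}"   # local clone of the GitHub repo
    zip="$repo/default.zip"

    run "$HOME/tools/iics" export --region us \
        --podHostName na1.dm-us.informaticacloud.com \
        --username "$IICS_USER" --password "$IICS_PASSWORD" \
        --zipFilePath "$zip" --logLevel info --artifacts "$artifact" || return 1

    run git -C "$repo" add default.zip || return 1
    run git -C "$repo" commit -m "Export $artifact from IICS" || return 1
    run git -C "$repo" push origin master
}

# Example (not run here):
#   DRY_RUN=1 export_and_push Explore/Default.Project
```

Running it once with DRY_RUN=1 shows the four commands without touching IICS or GitHub. Because git commit exits with a non-zero status when there is nothing new to commit, the function returns 1 in that case without pushing.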

Go to GitHub and confirm that you have a new file in your repository. Go to Jenkins and confirm that a new build was triggered on the commit. You have just extracted an Informatica job as a single artifact, checked it into GitHub and executed a Jenkins build. If you build this process into your normal routine, you will find it easier to integrate into larger enterprise projects in the future. In the next war room, you’ll all be speaking the same language.

Summary

This is not the fail-fast model usually enabled by Continuous Integration/Continuous Release cycles, and ETL projects typically do not fall into that category. IICS usually moves key business information from an on-premises source system to a cloud-based data warehousing solution. These ETL projects enable fail-fast projects, but they themselves are typically the subject of deep controls. However, you can integrate your Informatica Intelligent Cloud Services jobs into a standard deployment pipeline by using the new Asset Management command line interface (CLI).

About the Author

As a solutions architect with Perficient, I bring twenty years of development experience and I'm currently hands-on with Hadoop/Spark, blockchain and cloud, coding in Java, Scala and Go. I'm certified in and work extensively with Hadoop, Cassandra, Spark, AWS, MongoDB and Pentaho. Most recently, I've been bringing integrated blockchain (particularly Hyperledger and Ethereum) and big data solutions to the cloud with an emphasis on integrating modern data products such as HBase, Cassandra and Neo4J as the off-blockchain repository.
