MongoDB Integration in DataStage

MongoDB is an open-source NoSQL document database that provides high performance, high availability, and automatic scaling.

In this blog, let’s see how to load data into MongoDB from IBM DataStage by using the Java Integration stage.

 

Pre-Requisites:

  • Install the Eclipse IDE.
  • The MongoDB integration requires a Java jar file, built with the build file shown below.
  • The jar file should contain the following client classes:
    • com.ibm.is.cc.javastage.nosql.MongoStage: provides the main code for the implementation.
    • com.ibm.is.cc.javastage.nosql.MongoImport: provides the import functionality invoked from the ‘Generate’ button in the Java Integration stage.
    • com.ibm.is.cc.javastage.nosql.FieldMetadata: represents the metadata of a field discovered in a MongoDB document.
    • com.ibm.is.cc.javastage.nosql.BSONSerializer: implements a serializer to produce JSON from MongoDB documents.
  • The user class must be declared in the package com.ibm.is.cc.javastage.nosql.
  • MongoDB stores objects as BSON documents.
  • The MongoDB Java API returns these as hierarchical objects containing fields and lists.
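
To make the last two points concrete, here is a minimal sketch of how a hierarchical document could be rendered as JSON text. It is not the sample's BSONSerializer: the class name DocumentJsonSketch and the sample values in main are made up for illustration. It relies only on the fact that the driver's org.bson.Document class implements java.util.Map, so plain maps and lists stand in for a real document here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of the sample jar): renders nested maps and lists as JSON. */
final class DocumentJsonSketch {

    static String toJson(Object value) {
        StringBuilder out = new StringBuilder();
        write(value, out);
        return out.toString();
    }

    private static void write(Object value, StringBuilder out) {
        if (value == null) {
            out.append("null");
        } else if (value instanceof Map) {            // sub-document: fields in iteration order
            out.append('{');
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                if (!first) out.append(',');
                first = false;
                writeString(String.valueOf(e.getKey()), out);
                out.append(':');
                write(e.getValue(), out);
            }
            out.append('}');
        } else if (value instanceof List) {           // array
            out.append('[');
            boolean first = true;
            for (Object item : (List<?>) value) {
                if (!first) out.append(',');
                first = false;
                write(item, out);
            }
            out.append(']');
        } else if (value instanceof Number || value instanceof Boolean) {
            out.append(value);                        // note: NaN/Infinity are not valid JSON
        } else {
            writeString(value.toString(), out);       // everything else is written as a string
        }
    }

    private static void writeString(String s, StringBuilder out) {
        out.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) out.append(String.format("\\u%04x", (int) c));
                    else out.append(c);
            }
        }
        out.append('"');
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();   // made-up sample values
        doc.put("city", "SAMPLETOWN");
        doc.put("loc", Arrays.asList(-72.6, 42.07));
        doc.put("pop", 1234);
        System.out.println(toJson(doc));
    }
}
```

A real serializer would also need rules for BSON-specific types such as ObjectId and dates; this sketch simply writes them with toString().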

I used Eclipse to create the project and ran “ant build” in the root directory to compile it. This creates a jar file called MongoStage.jar in the jars subdirectory. The Ant build file contains properties that should be modified to match your environment. Below is the build file (build.xml) as configured for my environment.

 

<?xml version="1.0" encoding="UTF-8" ?>
<!--
//***************************************************************************
// (c) Copyright IBM Corp. 2012 All rights reserved.
// The following sample of source code ("build.xml") is owned by International
// Business Machines Corporation or one of its subsidiaries ("IBM") and is
// copyrighted and licensed, not sold. You may use, copy, modify, and
// distribute the Sample in any form without payment to IBM, for the purpose of
// assisting you in the development of your applications.
//
// The Sample code is provided to you on an "AS IS" basis, without warranty of
// any kind. IBM HEREBY EXPRESSLY DISCLAIMS ALL WARRANTIES, EITHER EXPRESS OR
// IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
-->

<project name="MongoDB sample build file" default="build" basedir=".">

  <!-- Modify this to point to your MongoDB install directory -->
  <property name="mongo.dir" value="../lib"/>

  <!-- Modify this to match the name of your MongoDB jar file -->
  <property name="mongo.jar" value="${mongo.dir}/mongodb-driver-3.4.2.jar"/>

  <property environment="env"/>
  <!-- Home path -->
  <property name="is-home" value="C:/IBM/InformationServer"/>
  <!-- Class path -->
  <property name="classes" value="${basedir}/classes"/>
  <!-- Jar path -->
  <property name="jars" value="${basedir}/jars"/>
  <!-- API path -->
  <property name="api.jar" value="${mongo.dir}/ccjava-api.jar"/>
  <!-- Binary JSON path -->
  <property name="bson.jar" value="${mongo.dir}/bson-3.0.1.jar"/>
  <property name="mongo.java.jar" value="${mongo.dir}/mongo-java-driver-3.4.2.jar"/>
  <!-- Main jar path and name -->
  <property name="user.jar" value="${jars}/MongoStage.jar"/>

  <path id="build.classpath">
    <pathelement location="${api.jar}"/>
    <pathelement location="${classes}"/>
    <pathelement location="${mongo.jar}"/>
    <pathelement location="${bson.jar}"/>
    <pathelement location="${mongo.java.jar}"/>
  </path>

  <target name="build">
    <mkdir dir="${jars}"/>
    <mkdir dir="${classes}"/>
    <javac srcdir="${basedir}"
           destdir="${classes}"
           classpathref="build.classpath"
           debug="true"
           deprecation="true"
           optimize="false">
    </javac>
    <jar jarfile="${user.jar}">
      <fileset dir="${classes}">
        <include name="com/ibm/is/cc/javastage/nosql/**/*.class"/>
      </fileset>
    </jar>
  </target>

  <target name="clean">
    <delete quiet="true" dir="${classes}"/>
    <delete quiet="true" dir="${jars}"/>
  </target>

</project>

 

[Screenshot: MongoDB Property File]
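
After the build finishes, it can be worth checking that the jar really contains the classes from the com/ibm/is/cc/javastage/nosql package. The following is a small sketch using only the standard java.util.jar API; the class name JarContentsSketch is made up, and the default path jars/MongoStage.jar assumes the program is run from the project root.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical check (not part of the sample): lists the class files packaged in a jar. */
final class JarContentsSketch {

    /** Returns the sorted names of all .class entries whose path starts with packageDir. */
    static List<String> classesUnder(String jarPath, String packageDir) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith(packageDir) && name.endsWith(".class")) {
                    names.add(name);
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass another path as the first argument if needed.
        String jarPath = args.length > 0 ? args[0] : "jars/MongoStage.jar";
        for (String name : classesUnder(jarPath, "com/ibm/is/cc/javastage/nosql/")) {
            System.out.println(name);
        }
    }
}
```

If the list is empty, the most likely cause is that the source files are not declared in the package that the include pattern in the build file expects.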

With the jar file created by the process above, let’s see how to use it in DataStage jobs.

Step 1:

A ZipCode text file is used as the source, which we are going to load into the target MongoDB database.

[Screenshot: MongoDB ZipCode]

The screenshot below shows the path given for the file and the data in the file.

[Screenshot: MongoDB ZipCode example]
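
As a rough sketch of what the job does with each source row, the code below turns one line of text into a document-like map. The real layout of the ZipCode file is visible only in the screenshot, so the comma delimiter, the four column names (zip, city, state, pop), the class name ZipRecordSketch and the sample row are all assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical example: maps one delimited source row to the fields of a target document. */
final class ZipRecordSketch {

    /** Assumed column layout: zip, city, state, pop (comma-delimited, no quoted fields). */
    static Map<String, Object> toDocument(String line) {
        String[] parts = line.split(",", -1);   // -1 keeps trailing empty columns
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 columns but found " + parts.length);
        }
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("zip", parts[0].trim());         // kept as text so leading zeros survive
        doc.put("city", parts[1].trim());
        doc.put("state", parts[2].trim());
        doc.put("pop", Integer.parseInt(parts[3].trim()));
        return doc;
    }

    public static void main(String[] args) {
        // Made-up sample row, not taken from the actual source file.
        System.out.println(toDocument("00000, SAMPLETOWN, ZZ, 1234"));
    }
}
```

In the DataStage job itself this mapping comes from the column definitions on the input link, so no hand-written parsing is needed; the sketch only illustrates the row-to-document shape.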

Step 2:

To import the metadata, click the configure button in the Java Integration stage.

[Screenshot: Java Integration]

The screenshots below show the process of importing metadata. Any changes made in the Custom Property Editor are reflected in the custom properties on the Properties tab of the Java Integration stage.

[Screenshot: MongoDB Custom Property Editor]

The screenshot below shows the current metadata for the input link. Clicking the ‘Browse objects’ button opens a child dialog.

[Screenshot: Column Metadata Importer MongoDB]

The MongoDB sample jar file connects to the MongoDB source and discovers the metadata of the collection named in the custom properties. It does this by querying the collection and examining the documents that are returned. The results are then displayed in the child dialog. Clicking the root node of the tree selects all columns for import.
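
The discovery step described above, examining returned documents to work out which fields exist, can be sketched as follows. This is not the sample's MongoImport or FieldMetadata code: FieldDiscoverySketch is a made-up name, and plain maps stand in for the documents a query would return (the driver's org.bson.Document implements java.util.Map).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: infers field paths and value types from sampled documents. */
final class FieldDiscoverySketch {

    /** Maps each dotted field path to the sorted set of Java type names seen for it. */
    static Map<String, Set<String>> discover(List<Map<String, Object>> documents) {
        Map<String, Set<String>> fields = new LinkedHashMap<>();
        for (Map<String, Object> doc : documents) {
            collect("", doc, fields);
        }
        return fields;
    }

    private static void collect(String prefix, Map<?, ?> doc, Map<String, Set<String>> fields) {
        for (Map.Entry<?, ?> e : doc.entrySet()) {
            String path = prefix.isEmpty()
                    ? String.valueOf(e.getKey())
                    : prefix + "." + e.getKey();
            Object value = e.getValue();
            if (value instanceof Map) {
                // Sub-document: descend and record its fields under a dotted path.
                collect(path, (Map<?, ?>) value, fields);
            } else {
                String type;
                if (value == null) {
                    type = "null";
                } else if (value instanceof List) {
                    type = "List";
                } else {
                    type = value.getClass().getSimpleName();
                }
                fields.computeIfAbsent(path, k -> new TreeSet<>()).add(type);
            }
        }
    }
}
```

Because documents in one collection need not share a schema, a field can show up with more than one type or be missing from some documents, which is why the sketch collects a set of types per field rather than a single one.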

[Screenshot: Select Column Metadata MongoDB]

Clicking OK in the above dialog dismisses it and populates the main dialog with the column definitions. The import action lets the user choose whether to overwrite the existing column definitions for the link.

[Screenshot: Column Metadata Importer MongoDB]

Clicking Finish will result in the column definitions being modified on the link, and the custom properties being saved.

MongoDB


Vinothkumar Sathiyamoorthi
