MongoDB is an open-source NoSQL (non-relational) database that provides high performance, high availability, and automatic scaling.
In this blog, let's see how to load data into MongoDB through the IBM DataStage tool by using the Java Integration stage.
Prerequisites:
- Install the Eclipse tool.
- Loading data into MongoDB from DataStage requires a Java jar file built from the code below.
- The jar file should contain the following client classes:
- com.ibm.is.cc.javastage.nosql.MongoStage: provides the main code for the implementation.
- com.ibm.is.cc.javastage.nosql.MongoImport: provides the import functionality invoked from the 'Generate' button in the Java Integration stage.
- com.ibm.is.cc.javastage.nosql.FieldMetadata: represents the metadata of a field discovered in a MongoDB document.
- com.ibm.is.cc.javastage.nosql.BSONSerializer: implements a serializer to produce JSON from MongoDB documents.
- The user class must be specified together with its package name (com.ibm.is.cc.javastage.nosql).
- MongoDB stores objects as BSON documents.
- The MongoDB Java API returns these as hierarchical objects containing fields and lists.
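To make the last two points more concrete, here is a minimal sketch of how a hierarchical document (fields, nested documents, and lists) can be turned into JSON text. It uses plain `java.util` maps and lists as stand-ins for the driver's document objects. The class name `DocumentJsonSketch` and the sample field names are hypothetical, and this is not the code of the IBM `BSONSerializer` class.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the IBM sample.
class DocumentJsonSketch {

    // Serializes maps, lists, strings, numbers, booleans and null to JSON text.
    // Non-finite numbers (NaN, Infinity) are not handled in this sketch.
    static String toJson(Object value) {
        StringBuilder out = new StringBuilder();
        write(value, out);
        return out.toString();
    }

    private static void write(Object value, StringBuilder out) {
        if (value == null) {
            out.append("null");
        } else if (value instanceof Map) {
            // A nested document becomes a JSON object.
            out.append('{');
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                if (!first) out.append(',');
                first = false;
                writeString(String.valueOf(e.getKey()), out);
                out.append(':');
                write(e.getValue(), out);
            }
            out.append('}');
        } else if (value instanceof List) {
            // A list becomes a JSON array.
            out.append('[');
            boolean first = true;
            for (Object item : (List<?>) value) {
                if (!first) out.append(',');
                first = false;
                write(item, out);
            }
            out.append(']');
        } else if (value instanceof Number || value instanceof Boolean) {
            out.append(value);
        } else {
            // Anything else is written as a quoted string.
            writeString(value.toString(), out);
        }
    }

    private static void writeString(String s, StringBuilder out) {
        out.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    // Remaining control characters use a four-digit hex escape.
                    if (c < 0x20) out.append(String.format("\\u%04x", (int) c));
                    else out.append(c);
            }
        }
        out.append('"');
    }

    public static void main(String[] args) {
        // Hypothetical document; the field names and values are assumptions.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("city", "SAMPLE CITY");
        doc.put("loc", Arrays.asList(-72.5, 42.1));
        doc.put("pop", 12345);
        System.out.println(toJson(doc));
    }
}
```

A real serializer would also need to handle BSON-specific types such as object IDs and dates, which this sketch leaves out.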
Here, I used the Eclipse tool to create the jar file and ran "ant build" in the root directory for the compilation. This creates a jar file called MongoStage.jar in the jars subdirectory. The Ant build file contains properties that should be modified to match your environment. Below is the build file as adjusted for my environment.
<?xml version="1.0" encoding="UTF-8" ?>
<!--
//***************************************************************************
// (c) Copyright IBM Corp. 2012 All rights reserved.
// The following sample of source code ("build.xml") is owned by International
// Business Machines Corporation or one of its subsidiaries ("IBM") and is
// copyrighted and licensed, not sold. You may use, copy, modify, and
// distribute the Sample in any form without payment to IBM, for the purpose of
// assisting you in the development of your applications.
//
// The Sample code is provided to you on an "AS IS" basis, without warranty of
// any kind. IBM HEREBY EXPRESSLY DISCLAIMS ALL WARRANTIES, EITHER EXPRESS OR
// IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
-->
<project name="MongoDB sample build file" default="build" basedir=".">
  <!-- Modify this to point to your MongoDB install directory -->
  <property name="mongo.dir" value="../lib"/>
  <!-- Modify this to match the name of your MongoDB jar file -->
  <property name="mongo.jar" value="${mongo.dir}/mongodb-driver-3.4.2.jar"/>

  <property environment="env"/>
  <property name="is-home" value="C:/IBM/InformationServer"/> <!-- Home path -->
  <property name="classes" value="${basedir}/classes"/> <!-- Class path -->
  <property name="jars" value="${basedir}/jars"/> <!-- Jar path -->
  <property name="api.jar" value="${mongo.dir}/ccjava-api.jar"/> <!-- API path -->
  <property name="bson.jar" value="${mongo.dir}/bson-3.0.1.jar"/> <!-- Binary JSON path -->
  <property name="mongo.java.jar" value="${mongo.dir}/mongo-java-driver-3.4.2.jar"/>
  <property name="user.jar" value="${jars}/MongoStage.jar"/> <!-- Main jar path and name -->

  <path id="build.classpath">
    <pathelement location="${api.jar}"/>
    <pathelement location="${classes}"/>
    <pathelement location="${mongo.jar}"/>
    <pathelement location="${bson.jar}"/>
    <pathelement location="${mongo.java.jar}"/>
  </path>

  <target name="build">
    <mkdir dir="${jars}"/>
    <mkdir dir="${classes}"/>
    <javac srcdir="${basedir}" destdir="${classes}" classpathref="build.classpath" debug="true" deprecation="true" optimize="false">
    </javac>
    <jar jarfile="${user.jar}">
      <fileset dir="${classes}">
        <include name="com/ibm/is/cc/javastage/nosql/**/*.class"/>
      </fileset>
    </jar>
  </target>

  <target name="clean">
    <delete quiet="true" dir="${classes}"/>
    <delete quiet="true" dir="${jars}"/>
  </target>
</project>
With the above process, we can create the jar file. Now let's see how we can use it in DataStage jobs.
Step 1:
A ZipCode text file is used as the source, which we are going to load into the target MongoDB database.
The file path and the file's data are shown in the screenshot below.
Step 2:
To import the metadata, click the Configure button in the Java Integration stage.
The screenshots below depict the process of importing metadata. Any changes made in the Custom Property Editor are reflected in the custom properties on the Properties tab of the Java Integration stage.
The snapshot below depicts the current metadata for the input link. Clicking the 'Browse objects' button opens a child dialog.
The MongoDB sample jar file connects to the MongoDB source and queries the metadata of the collection provided in the custom properties. It achieves this by querying the collection and examining the documents that are returned. The results are then displayed in the child dialog. Clicking on the root node of the tree selects all columns for import.
Clicking OK in the above dialog dismisses it and populates the main dialog with the column definitions. The import action lets the user choose whether to overwrite the existing column definitions for the link.
Clicking Finish will result in the column definitions being modified on the link, and the custom properties being saved.
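The metadata import in Step 2 works by examining the documents returned from the collection. As a rough illustration of that idea, the sketch below walks a few sample documents and records each field path with a simple type name. It uses plain `java.util` maps and lists in place of the driver's document objects. The class name `FieldDiscoverySketch`, the dotted path convention, and the rule of falling back to `String` when types conflict are all assumptions of this example, not the behavior of the IBM `MongoImport` or `FieldMetadata` classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the IBM sample.
class FieldDiscoverySketch {

    // Returns dotted field path -> simple type name, in first-seen order.
    static Map<String, String> discover(List<Map<String, Object>> documents) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map<String, Object> doc : documents) {
            walk("", doc, fields);
        }
        return fields;
    }

    private static void walk(String prefix, Map<?, ?> doc, Map<String, String> fields) {
        for (Map.Entry<?, ?> e : doc.entrySet()) {
            String path = prefix.isEmpty()
                    ? String.valueOf(e.getKey())
                    : prefix + "." + e.getKey();
            Object value = e.getValue();
            if (value instanceof Map) {
                // Nested document: descend and record its fields under the parent path.
                walk(path, (Map<?, ?>) value, fields);
            } else {
                record(path, typeName(value), fields);
            }
        }
    }

    private static void record(String path, String type, Map<String, String> fields) {
        String previous = fields.get(path);
        if (previous == null || previous.equals("Unknown")) {
            fields.put(path, type);
        } else if (!type.equals("Unknown") && !previous.equals(type)) {
            // Assumption: conflicting types across documents fall back to String.
            fields.put(path, "String");
        }
    }

    private static String typeName(Object value) {
        if (value == null) return "Unknown";
        if (value instanceof List) return "List";
        return value.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        // Hypothetical document; the field names and values are assumptions.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("city", "SAMPLE CITY");
        doc.put("pop", 12345);
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc);
        System.out.println(discover(docs));
    }
}
```

Each entry of the resulting map corresponds roughly to one candidate column definition for the link. Sampling only a few documents keeps the import fast, but it can miss fields that appear only in later documents.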