Why You Need Grid Configuration in Datastage

Prior to running your project on a grid system, you must ensure that your grid environment is configured.

 

Why Do You Need Grid Configuration in Datastage?

  • Grid computing improves performance by making maximum use of the compute nodes available to one or more projects simultaneously
  • Enables both grid distribution methods simultaneously
  • Allows you to assign jobs to specific servers in the grid
  • Allows you to assign a parallel job to run across multiple servers

Platforms you can use:
Red Hat / SUSE
AIX / Power

Why Is the Data Integration Grid Driving Rapid Customer Adoption?

Better data yields better decisions.

  • Grid-based integration makes it possible for companies to process and analyze larger data volumes, create a consolidated view of data, and put the right data into the enterprise data warehouse and other critical enterprise applications
  • More sources of data, more data from each source, better matching, real-time versus batch
  • Better business decisions
  • Enhanced customer relationships
  • More cross selling and upselling
  • New services delivered to customers

Reduced Data Integration Costs.

  • Reduced administration and operating costs – centralization of staff.
  • Reduced data integration project costs – lower cost per project delivered by data integration center of excellence versus siloed projects.
  • Reduced hardware cost.

What are the Benefits of Grid Computing?

  • Low cost hardware
  • High-throughput processing
  • Resource manager monitors availability of hardware at startup / job deployment time
  • SLA (Service Level Agreement) – Provides consistent run times and isolates jobs that execute concurrently.

 

Comparison of Before and After Grid Configuration

Before Grid:

Architecture & proliferation of SMP servers:
• Higher capital costs through limited pooling of IT assets across silos
• Higher operational costs
• Limited responsiveness due to more manual scheduling and provisioning
• Inherently more vulnerable to failure
• No ability to exploit available capacity when other teams are idle

After Grid:

“Virtualized” infrastructure:
• Creates a virtual data integration collaboration environment
• Virtualizes application services execution
• Dynamically fulfills requests over a virtual pool of system resources (nodes)
• Offers an adaptive, self-managed operating environment that guarantees high availability
• Delivers maximum available capacity to anyone participating in the grid

Grid Environment Variables:

APT_GRID_ENABLE
• YES: The current osh intercepts the run script to create a new configuration file
• NO: Use the existing configuration file

APT_GRID_QUEUE
• Name of the Resource Manager queue the job will be submitted to

APT_GRID_COMPUTE_NODES
• The number of compute nodes required for the job
• Used to request the number of compute nodes in the dynamically created configuration file
• A compute node is a server that can be used for processing (for example, not one dedicated to I/O or DB2)
• Default value is 1

APT_GRID_PARTITIONS
• Used to create multiple partitions for each compute node
• Default value is 1
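
To make the interaction between these variables concrete, here is a minimal Python sketch that reads them from an environment mapping and derives how many partitions a job would request in total. The names GridSettings and read_grid_settings and the queue name etl_queue are hypothetical, chosen only for illustration. The defaults of 1 come from the list above; treating an unset APT_GRID_ENABLE as NO is an assumption, not documented toolkit behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GridSettings:
    """Hypothetical container for the grid variables described above."""
    enabled: bool
    queue: str
    compute_nodes: int
    partitions: int

    @property
    def total_partitions(self) -> int:
        # Each compute node is split into `partitions` partitions.
        return self.compute_nodes * self.partitions


def read_grid_settings(env: Mapping[str, str] = os.environ) -> GridSettings:
    """Read the APT_GRID_* variables, falling back to the defaults listed above."""
    return GridSettings(
        # Assumption: anything other than YES (including unset) means NO.
        enabled=env.get("APT_GRID_ENABLE", "NO").strip().upper() == "YES",
        queue=env.get("APT_GRID_QUEUE", ""),
        compute_nodes=int(env.get("APT_GRID_COMPUTE_NODES", "1")),
        partitions=int(env.get("APT_GRID_PARTITIONS", "1")),
    )


settings = read_grid_settings({
    "APT_GRID_ENABLE": "YES",
    "APT_GRID_QUEUE": "etl_queue",      # hypothetical queue name
    "APT_GRID_COMPUTE_NODES": "3",
    "APT_GRID_PARTITIONS": "2",
})
print(settings.total_partitions)        # 3 compute nodes x 2 partitions each
```

Passing a plain dictionary instead of the real process environment keeps the sketch easy to try without touching a DataStage installation.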

Resource Management:
• Tracks resources (nodes) based on which jobs are already running and which servers are down
• Queues jobs when no resources are available
• Provides a list of the nodes that are assigned to a job
• Offers extensive advanced features, of which only a subset is leveraged
• The manager node is where tasks are scheduled and resources are allocated; this usually happens on the head node
• Compute nodes run agent processes that communicate back to the manager
• Jobs (scripts or executables) are started on a compute node, not on the head node
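
The queueing behavior described above can be illustrated with a toy model. The following Python sketch is not the API of any real resource manager: ToyResourceManager, its methods, and the node and job names are all hypothetical, and the first-in, first-out dispatch rule is an assumption made to keep the example short.

```python
from collections import deque


class ToyResourceManager:
    """Illustrative stand-in for a resource manager; not a real product's API."""

    def __init__(self, nodes):
        self.free = list(nodes)   # compute nodes that are up and idle
        self.running = {}         # job name -> list of nodes assigned to it
        self.queue = deque()      # (job name, nodes needed), waiting in FIFO order

    def submit(self, job, nodes_needed):
        """Queue a job, then start it right away if enough nodes are free."""
        self.queue.append((job, nodes_needed))
        self._dispatch()

    def finish(self, job):
        """Release a finished job's nodes and let waiting jobs start."""
        self.free.extend(self.running.pop(job))
        self._dispatch()

    def mark_down(self, node):
        """Stop offering an idle node that has gone down."""
        if node in self.free:
            self.free.remove(node)

    def _dispatch(self):
        # Start queued jobs in order while the job at the head of the queue fits.
        while self.queue and self.queue[0][1] <= len(self.free):
            job, needed = self.queue.popleft()
            self.running[job] = [self.free.pop(0) for _ in range(needed)]


rm = ToyResourceManager(["compute1", "compute2", "compute3"])
rm.submit("job_a", 2)   # starts immediately on two free nodes
rm.submit("job_b", 2)   # only one node is free, so the job waits in the queue
rm.finish("job_a")      # frees two nodes, which lets job_b start
print(rm.running)
```

The list stored in running for each job plays the role of the node list that the resource manager hands back for a job.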

Grid Enablement Toolkit:

What does it do?
• Provides prebuilt integration with resource managers
• Coordinates activities between the parallel framework and the resource manager
• Creates the parallel configuration file that drives the dynamic assignment of compute resources
• Logs its interaction with the resource manager (RM) and usage details
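
As a rough illustration of the dynamically created configuration file, the Python sketch below turns a list of assigned hosts into text in the parallel configuration file format, with one logical node per partition on each host. The function build_config, the node naming scheme, the host names, and the disk paths are hypothetical, and the file the toolkit actually writes may contain additional entries or differ in detail.

```python
def build_config(hosts, partitions_per_node, disk, scratch):
    """Return configuration file text with one logical node per partition."""
    blocks = []
    for host_index, host in enumerate(hosts, start=1):
        for part in range(1, partitions_per_node + 1):
            blocks.append(
                # Hypothetical naming scheme: compute<host>_<partition>.
                f'  node "compute{host_index}_{part}"\n'
                "  {\n"
                f'    fastname "{host}"\n'
                '    pools ""\n'
                f'    resource disk "{disk}" {{pools ""}}\n'
                f'    resource scratchdisk "{scratch}" {{pools ""}}\n'
                "  }\n"
            )
    return "{\n" + "".join(blocks) + "}\n"


# Hypothetical hosts, standing in for the node list a resource manager returns.
config_text = build_config(["gridnode01", "gridnode02"], 2, "/data/ds", "/scratch/ds")
print(config_text)
```

With two hosts and two partitions per host, the generated text contains four node entries, two for each host.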

Workflow of GRID:

[Figure: grid workflow diagram]

Jayanth Kaliappan