This post is the second in a series covering the capabilities, use cases and product offerings of change data capture (CDC) technology.
The critical capabilities expected of a CDC tool are:
- Selective data replication and synchronization is the ability to keep data in sync between multiple databases. It usually supports high-volume, mission-critical scenarios. Typical uses are creating redundant copies of mission-critical data and keeping data in all the operational systems in sync.
- Volume data movement is the ability to extract and deliver large volumes of data. It is required to support an organization's business intelligence, data warehouse and data migration efforts.
- Message Oriented Middleware (MOM) is infrastructure that supports sending and receiving messages between distributed systems. MOM allows application modules to be distributed over heterogeneous platforms and reduces the complexity of developing applications that span multiple operating systems and network protocols. The data is encapsulated into messages that different applications can exchange in real time.
- Data federation is a technology that lets an organization aggregate data from disparate sources in a virtual database so it can be used for business intelligence (BI) or other analysis. The virtual database created by data federation technology does not contain the data itself. Instead, it contains information about the actual data and its location. The actual data is left untouched.
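The data federation idea in the last bullet can be made concrete with a minimal sketch. It assumes two hypothetical source systems (`crm` and `erp`, modelled here as in-memory SQLite databases) and a hypothetical `CATALOG` that stores only where each logical table lives. It is an illustration of the concept, not how any particular federation product works.

```python
import sqlite3

# Two independent source systems (hypothetical; in-memory for the sketch).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)",
                [(1, "Acme"), (2, "Globex")])

erp = sqlite3.connect(":memory:")
erp.execute("CREATE TABLE invoices (customer_id INTEGER, amount REAL)")
erp.executemany("INSERT INTO invoices VALUES (?, ?)",
                [(1, 100.0), (1, 50.0), (2, 75.0)])

# The "virtual database": metadata about the data and its location,
# not the data itself.
CATALOG = {
    "customers": (crm, "SELECT id, name FROM customers"),
    "invoices": (erp, "SELECT customer_id, amount FROM invoices"),
}


def read(logical_name):
    """Fetch rows from the owning source at query time; nothing is copied ahead of time."""
    conn, sql = CATALOG[logical_name]
    return conn.execute(sql).fetchall()


def revenue_by_customer():
    """Combine rows from both sources in memory for analysis."""
    totals = {}
    for customer_id, amount in read("invoices"):
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return {name: totals.get(cid, 0.0) for cid, name in read("customers")}
```

Because the catalog holds only locations, a row added to the `erp` source shows up in the next call to `revenue_by_customer()` without any copy step, and the source tables themselves are never modified by the federated read.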
There are five major use cases for CDC technology, detailed below.
- BI and Data Warehousing. The most valuable use case for CDC technology is in the data warehouse and BI world. Data is extracted from multiple operational systems and delivered as an integrated data structure that provides analytics for the whole company.
- MDM solutions. CDC is used to support Master Data Management (MDM), which removes duplicates, standardizes data and incorporates rules that keep incorrect data from entering the system, in order to create an authoritative source of master data. Master data are the products, accounts and parties for which business transactions are completed. CDC helps maintain data consistency and helps MDM tools provide a unified data structure for some of the corporate entities. Data replication and synchronization functions are increasingly required to support MDM.
- Data Migration. Many organizations run multiple data migration projects and face large-scale migration efforts at any given time. The usual practice of custom coding is giving way to the more abstract capabilities that CDC tools provide. Many legacy application changes and consolidation efforts are being addressed through data migration as well.
- Data Consistency across applications. CDC is used to maintain consistency and data redundancy for mission-critical application data. CDC tools help maintain data consistency between application systems that rely on different database solutions. For example, the details of a new purchase order need to be reflected in the billing and inventory systems. Propagating them avoids confusion and yields more reliable and consistent data across databases and applications.
- Data Sharing with Vendors, Partners and Customers. There are requirements and standards that companies need to maintain in order to run a business. Companies are increasingly expected to open up their data platforms to vendors and partners, giving them more insight into and visibility of key operational data. Data integration tools can help in these scenarios, which often consist of the same types of data access, transformation and movement components found in other common use cases.
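The purchase order example from the data consistency use case can be sketched as a small capture-and-apply loop. This is a simplified illustration under stated assumptions: the systems are plain dictionaries, `ChangeEvent`, `capture` and `apply_changes` are hypothetical names, and changes are recorded at write time, whereas real CDC tools typically read them from the database's transaction log.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change, encapsulated as a message (hypothetical shape)."""
    table: str
    op: str
    key: int
    row: dict


def capture(source, key, row, change_log):
    """Write to the source system and record the change for downstream consumers."""
    op = "update" if key in source else "insert"
    source[key] = dict(row)
    change_log.put(ChangeEvent("purchase_orders", op, key, dict(row)))


def apply_changes(change_log, targets):
    """Drain the change log in order and apply each event to every target system."""
    applied = 0
    while not change_log.empty():
        event = change_log.get()
        for target in targets:
            # Upsert by key, so replaying the same event leaves the target unchanged.
            target[event.key] = dict(event.row)
        applied += 1
    return applied


# The order system is the source; billing and inventory are kept in sync.
orders, billing, inventory = {}, {}, {}
log = queue.Queue()

capture(orders, 1001, {"sku": "A-42", "qty": 3}, log)  # new purchase order
capture(orders, 1001, {"sku": "A-42", "qty": 5}, log)  # later correction
applied = apply_changes(log, [billing, inventory])
```

After the log is drained, billing and inventory hold the same purchase order details as the order system. Applying events in the order they were captured is what makes the later correction win over the original insert.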
Current Product Offerings
Gartner's product rating covers the prominent products and market leaders, scored on the features and capabilities these tools offer. The chart gives a quick overview of the ratings.
Source: Gartner
Product Offerings and Key Players
- IBM – Information Server
- iWay Software – DataMigrator, Data Hub, Service Manager
- Informatica – Informatica Platform
- Microsoft – SQL Server 2012 – Integration Services
- Oracle – GoldenGate and Oracle Data Integrator
- SAP – Data Services
- SAS – DataFlux's Data Management Platform
- Talend – Talend Integration Suite
Conclusion
CDC has widespread use cases and capabilities in an organisation's IT efforts. IT departments would be wise to adopt CDC technologies, recognize the capabilities they provide, and set enterprise-wide guidelines for their use.