As companies transform their businesses to be data-driven and take advantage of Big Data, they are quickly realizing that a shortage of Big Data-centric data scientists and data wranglers is blocking them from capturing its value. One limiting factor is the skill set that working with Big Data has typically required: Java and MapReduce have both been prerequisites, and that has created a barrier to entry. This barrier exists even though most companies have a number of DBAs, data analysts, and developers who are familiar with more traditional SQL-based data access and integration tools. Providing tools that let these untapped resources access Big Data in the environments in which they are already trained will allow companies to quickly address, in part, the data scientist gap.
Oracle has recognized this need for Oracle-centric Big Data tools and has developed a number of connectors that enable this army of Oracle professionals to access and manipulate Big Data using tools and languages they already know. Oracle offers five such connectors and technologies, each targeted at its own niche within the Oracle ecosystem.
- Oracle Loader for Hadoop – Allows DBAs and developers to bulk load data from Hadoop into Oracle without having to understand MapReduce, YARN, or Java. This Hadoop-aware connector pushes Oracle data-type conversions down to the Hadoop data nodes, reducing CPU utilization on the Oracle database at load time.
- Oracle R Advanced Analytics for Hadoop – This connector is for data scientists who are used to working with Oracle Advanced Analytics and R in the Oracle Database. It provides a single interface for combining data from HDFS, Hive, Oracle RDBMS, and other supported data sources into one analytic task or set of tasks.
- Oracle SQL Connector for HDFS – SQL is the language most Oracle professionals are familiar with, and a connector that lets HDFS data be combined with data in an Oracle database in a single query greatly opens up access to Big Data.
- Oracle XQuery for Hadoop – Many logs and much machine data are stored in XML or JSON structures. The Oracle XQuery connector extends Oracle’s XML Query search technology to HDFS and Oracle NoSQL databases, allowing access to these popular structures.
- Oracle Data Integrator for Big Data – Although not strictly a connector, this product enables ODI professionals to work with HDFS data as they would any other data source. Like other Big Data technology from Oracle, ODI is Hadoop-aware and pushes transformation processing down to the Hadoop data nodes.
If you are an enterprise with a significant investment in Oracle technologies and are pursuing Big Data, do not ignore the array of Big Data connectors from Oracle. These connectors enable your Oracle professionals to quickly apply their existing Oracle and data knowledge to deliver value from your Big Data investment.