Loading Data into Oracle Autonomous Data Warehouse using OAC - Perficient Blogs
Blog
  • Topics
  • Industries
  • Partners

Explore

Topics

Industries

Partners

Loading Data into Oracle Autonomous Data Warehouse using OAC

In this blog post I will show you some exciting functionality that is available in Oracle Analytics Cloud (OAC).  It is the kind of functionality that you rarely see in front end reporting and visualizations platforms – either before the Cloud or even after the advent of the Cloud.  It greatly empowers end users to accomplish data management tasks with little to no assistance from IT.

Oracle ADW stands for ‘Autonomous Data Warehouse’.  ADW is Oracle’s Cloud-based, fully managed, high performance Cloud data warehouse that runs on Oracle Exadata specialized database machines.  An ADW data warehouse can be provisioned in minutes, can scale to petabyte level, can be turned on and off by the end user and supports cloning of entire databases in a couple of minutes.

In this blog post I will show you how to load data into ADW using only Oracle Analytics Cloud (OAC) functionality.  No code will need to be written.  I won’t have to use SQL development tools.  I won’t even have to create the target table in ADW.  OAC will do that for me.  Also, I am not loading just one file.  I will load two files and join them on a common value before I load them into the target ADW table (similar to using vlookup to join two Excel sheets on a common value).

I will do this using the free trial versions of Oracle ADW and OAC.  So anyone with a credit card can immediately try this themselves (your card will not automatically start to be charged after the end of the free trial.  Your trial will expire after 30 days without a charge unless you choose to continue and pay going forward).

Let’s get started!

Set Up the Trial Versions of Autonomous Data Warehouse (ADW) and Oracle Analytics Cloud (OAC)

  1.  First we will sign up for the free trial versions of ADW and OAC.  Please go to https://cloud.oracle.com/tryit.  Click on ‘Create your free account.’  Just follow the prompts to create your free Oracle Cloud account.  You will be asked to provide an email address and a credit card number (again-just to reiterate-your credit card will not be charged).   This account will allow you to access both ADW and OAC.
  2. Shortly after signing up for your free account you will receive an email like the one below from Oracle Cloud with your access details and a temporary password.  Click on ‘Get Started with Oracle Cloud’ to establish your permanent password and begin to set up your ADW and OAC instances.
  3. After setting up your permanent password you will be brought to a screen like the one below where you can create your ADW instance.  Click on ‘Autonomous Data Warehouse – Create a data warehouse’.
  4. The next 4 screen shots include everything you need to define and create your ADW instance.  It is very straightforward.  On the first screen below you can leave everything as the default unless you want to change the display name of the ADW instance (which is what I have done below).
  5. On the next screen you should make sure that ‘Data Warehouse’ is checked.  You can configure the database as desired.  The defaults for both CPU core count and Storage (TB) are 1 each.  Typically this is enough for a trial system.  I have changed CPU core count to ‘2’ in the screen shot below.  I have also turned on auto scaling which is off by default.  If this is something you are interested in testing then you should turn it on.
  6. On the next screen you will be asked to create the password for the ADMIN user.  This is important and not something you want to lose or forget so please document it somewhere safe.
  7. On this last set up screen you should choose ‘Bring Your Own License’ as the License Type.  That’s it.  No more set up info is required.  Just click ‘Create Autonomous Database’ to start the process of provisioning your ADW instance.
  8. At first you will get a screen like the one below saying your ADW instance is provisioning.
  9. After a minute or two (or sometimes a bit longer) provisioning will complete and you will get a screen like the one below.  That is it.  You now have an Oracle autonomous data warehouse instance.
  10. Although you now have an ADW instance you will not be able to access it unless you have the proper credentials.  So this next step is very important.  In this step we will download a wallet file that contains the credentials for accessing our ADW instance.  Click ‘Download’.   The password for these downloaded credentials is the password established for the ADMIN user in step 6.
  11. Save the downloaded wallet zip file in a location you will remember.  It is not necessary to extract the files from the zip file. That is everything that needs to be done for ADW.  We will now turn our attention to creating our OAC instance.
  12. Now that we have an ADW instance it is time to provision and create our OAC instance.  Go to the ‘hamburger’ in the upper left corner and scroll down to ‘My Service Dashboards’ and click on it.
  13. On the screen below, click ‘Create Instance’.
  14. Then on the screen below click ‘Create’ on the ‘Analytics’ button.
  15. Finally on the screen below click ‘Create Instance’. You will be taken through a few basic questions to set up and configure your OAC instance.
  16. You will be asked to provide a name for your instance.  Choose a region closest to you (e.g., if on the East Coast select ‘us-ashburn-1’ which is in VA).  For Edition select ‘Oracle Analytics Cloud-Enterprise Edition’.  For Feature Set select ‘Business Intelligence – Enterprise Data Models’. Click ‘Next’.
  17. Review all the information on the screen below to make sure it is accurate and then click ‘Create’.
  18. You will get a screen like the one below that will appear while your OAC instance is being provisioned.  You can refresh the screen by clicking the circular arrow icon in the upper right hand corner.
  19. When your OAC instance is ready you will get a screen like the one below that will display the name of your OAC instance.  Click ‘Oracle Analytics Cloud URL’ to open and go to your OAC instance.

Establish Connection from OAC to ADW

  1. You will be brought to the screen below which is the home screen for OAC.  We are now ready to start the activities required to load data into our Oracle ADW instance.  The first thing we will do is establish a connection from OAC to our ADW instance.  Click on the ‘Create’ button in the upper right corner and then click on the ‘Connection’ icon.
  2. You will get the screen below.  Oracle Analytics Cloud can connect to many, many different data sources.  In this case you will want to click on the icon called ‘Oracle Autonomous Data Warehouse Cloud’.
  3. This next screen is where you will establish the connection to ADW from OAC.  Input a Connection Name of your choosing.  For ‘Client Credentials’ select the zip file that you downloaded in steps 10 and 11 above (you don’t have to extract the files out of the zip file-you can just select the zip file).  The password will be the password you established for the ADMIN user in step 6.  You can leave Service Name as the default. Click ‘Save’.
  4. You can check that the connection exists by clicking on the ‘hamburger’ in the upper left corner.  Then click on ‘Data’ then click on ‘Connections’.  You will see your connection listed.  You now have a working connection to ADW from OAC.

Upload Excel Files from Local Drive to OAC Data Set Storage Area

  1. Next we will upload two Excel files from our local drive into OAC’s data set storage area.  It depends on configuration and security by person but OAC can store approximately 250 GB of files.  This storage is separate from ADW.  OAC’s data set storage is like storage on your hard drive except in the Cloud.  After we upload these two Excel files into OAC’s storage area we will then load them into ADW for permanent storage and analysis.  Click on ‘Create’ in the upper right corner and then click on ‘Data Set’.
  2. You will get the screen below.  Click on ‘Drop data file here or click to browse’ and select an Excel file from your local drive.
  3. You will then get a screen that looks like this.  I have selected an Excel file to upload called New Company Sales Data.xlsx.  It has two sheets and in the screen below I will be uploading the first sheet of the Excel file.
  4. Before I upload this data I need to do just a little data prep.  BASKET_ID and PROD_ID are numeric values so by default OAC is planning to treat them like measures.  However, they are really attributes.   So I will click on each one, go to the left hand side and change ‘Treat As’ from Measure to Attribute.  Please analyze the data set you will upload and make sure only numeric data that you plan to summarize will be treated as a measure. Every other column should be treated as an attribute.  Click ‘Add’ in the upper right hand corner to upload this to OAC.
  5. Let’s check to make sure the data set has been uploaded.  To do that I will click on the ‘hamburger’ in the upper left hand corner.  Then I will click on ‘Data’ and then ‘Data Sets’.  Now I can see the file I uploaded.
  6. Now I will repeat the same upload process to upload a mapping table that is on the second worksheet in the same Excel file.  If you don’t want to try out the ‘join’ functionality then you can skip the uploading of a second file.  Click on ‘Create’ in the upper right hand corner.  Then click on ‘Data Set’.  You will get the screen below.
  7. I will select the same file from my hard drive called ‘New Company Sales Data.xlsx’ and I will upload the second worksheet in the file.  The second worksheet is a mapping table that maps Product Category to Product Type.  Product Type was not an attribute in the first file I uploaded but I want to analyze sales by Product Type.  So I need to add Product Type to the file with the sales data.  To do this we will use standard OAC functionality to apply the Product Category to Product Type mapping table to the sales data before we load it into ADW.  The resultant table in ADW will have a column for Product Type and each row will have a value in that column.  The mapping table does not require any data prep before uploading.  However, I do need to make sure I select the correct worksheet from within the Excel file for upload.  Click on ‘Add’ to upload the mapping table to the OAC storage area.
  8. We now have two data sets that have been uploaded to the OAC storage area and 1 connection set up from OAC to ADW.  We have all the ingredients we need to go ahead and load this data into ADW from OAC. The screens below display the two data sets that have been uploaded to the OAC Storage Area and the OAC to ADW connection.  Click on ‘Create’ and then Click on ‘Data Flow’.

Set up and Run Data Flow in OAC to Load Data into ADW

  1. After you click on ‘Data Flow’ you will get the screen below.  We are going to use standard OAC functionality to set up a Data Flow to join the two files together and then load the resultant data set into a single table in ADW.  Once we set up the Data Flow we need to ‘run’ the Data Flow to actually load the data into ADW (everything prior to running the Data Flow is just set up).  Data Flow functionality in OAC is very powerful.  It is possible to create very complex Data Flows with multiple branches, aggregations and data transformation.  The Data Flow we will create will be relatively simple with a step to join the two data files on a common key and then a step to load the results into ADW.  The first step is to click on ‘New Company Sales Data’ and then click on ‘Create Data Set’.
  2. The ‘New Company Sales Data’ data set is now added to the Data Flow as can be seen in the screen print below.  Now it is time to add the second data set to the Data Flow.  As you will recall the second data set is a mapping table.  In order to add the second data set, we need to drag and drop the ‘Add Data’ icon from the top of the left hand menu just under ‘Data Flow Steps’ to the white section of the Data Flow just under the blue icon for New Company Sales Data.
  3. After executing the drag and drop of the ‘Add Data’ menu option, we will get a screen like the one below.  In this case, I will select the second data set – Product Category to Type Map.  Click ‘Add’.
  4. After clicking ‘Add’ we will get the screen below.  You can see the second data set below the first data set in this Data Flow.
  5. The next step is to join these two data sets together on a common value so the result set will contain Product Type.  To do this we will drag and drop ‘Join’ from the menu on the left to the white portion of the Data Flow.  After doing so we will get the screen below.  OAC will indicate to us that the join is incomplete and it will suggest a node to click on to define the join criteria.  Click on the small circle with the light blue outline to define the join criteria.
  6. After clicking on the light blue circle we will get the screen below.  OAC will take an initial guess at what columns match between the two files.   In this case, OAC’s initial guess at matching columns is wrong.  We need to change ‘BASKET_ID’ on Input 1 to be ‘PROD_CATEGORY’.
  7. Now that we have successfully defined the join criteria OAC is able to join the two data sets together and display the resultant rows for us.  Please notice that the resultant rows that are displayed contain all columns from both data sets (so Product_Type is in the result set).
  8. Now we have come to a very important step.  This is where will will set up the Data Flow to save the result of the join to ADW.  The first step to do this is to drag ‘Save Data Set’ from the left hand menu to the white portion of the data flow over the ‘+’ sign to the right of ‘Join’.  Then we will provide a name for the data set that will result from the join (we have called it ‘sales_trans_cust’).  Then importantly we need to designate whether the data set should be saved to OAC data set storage (like the files we uploaded) or to a database connection.  Of course, we should select ‘Database Connection’.  Then click on ‘Select Connection’ which will appear right below ‘Save Data to Database Connection’ after Database Connection is selected.
  9. After clicking on ‘Select Connection’ we will get the screen below.  Since we have only one database connection set up so far in this trial version, we have only one to choose from.  Select the connection to ADW that we previously set up.
  10. In this step we will do two very important things.  First we will define the table name (should be the same as the data set name we defined in step 39 – i.e., sales_trans_cust).  We will also decide whether to replace existing data when this Data Flow is run or to add new data to existing data.  We will select ‘Add new data to existing data’.  Now, before we run this Data Flow let’s save it.  Click on ‘Save’.
  11. We will give the Data Flow a name and click on ‘OK’.
  12. Let’s check to make sure the Data Flow is saved.  Click on the ‘hamburger’ in the upper left corner.  Click on ‘Data’.  Click on ‘Data Flows’.  You should see the Data Flow in the list of Data Flows.  Click on your Data Flow to open it.
  13. Now it is time to run the Data Flow to load the data into ADW.  Click on ‘Run Data Flow’ in the upper right corner.
  14. A screen like the one below will appear while the Data Flow is running.  Even though the data being loaded is close to 1 million records with join processing included, the load into ADW took less than 1 minute.
  15. After the load is completed, let’s check to make sure the data was loaded successfully.  First let’s go to the ‘Data Set’ area of OAC.  Click on the hamburger in the upper left corner.  Click on ‘Data’.  Then click on ‘Data Sets’.  You will see a data set called ‘sales_trans_cust’.  This is a different type of OAC data set from the Excel files that were uploaded.  You can see that the icon is different for ‘sales_trans_cust’.  It is using the ‘connection’ icon because the data for ‘sales_trans_cust’ is not stored in the OAC data set storage area.  The data for ‘sales_trans_cust’ is stored in ADW.

Use OAC to Create Data Visualization Against Data in ADW for Analysis

  1. Now let’s create a Data Visualization project to analyze the data in sales_trans_cust in ADW.  Click on ‘Create’ and then click on ‘Project’.
  2. Click on ‘sales_trans_cust’ as the data set.
  3. On the screen below you can see that sales_trans_cust is available for creating visualizations.  The data is stored in ADW and is the result of joining two files together via the OAC Data Flow we created.  We did not have to create the table in ADW – OAC did that for us.
  4. For those not familiar with how to create data visualizations in OAC, just drag and drop data elements onto the canvas and OAC will suggest the best visualization.  Hold down the ‘Ctrl’ key when you select more than one data element to be dragged onto the canvas.  Below are some visualizations you can create from the data that was loaded into ADW.
  5. Now, purely for proof purposes, I have included a screen print from Oracle SQL Developer where I am directly querying the table called ‘sales_trans_cust’ in ADW (i.e., not querying through OAC).  You can see that the table exists in ADW and that it has the same total for AMOUNT_SOLD as appears in the OAC data visualization project.

There is a tremendous amount of powerful functionality in ADW and OAC.  The trial versions are a great way to try out the functionality with no strings attached.  Explore and have fun!

Director - Oracle Business Analytics

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Subscribe to the Weekly Blog Digest:

Sign Up