Skip to main content

Technical

Creating Snowflake External Stage Using Google Cloud Storage

Istock 1366415101

If you are a Google cloud platform (GCP) user and want to use GCS as the external stage to load/unload data to snowflake, this blog will help you understand the step-by-step procedures.

Snowflake-External-Stage-Using-Google-Cloud-Storage

Before going any further on unloading and loading data from GCP. Let us understand what is Load/Unload? and what are the supported cloud providers through which you can load/unload your data and query using Snowflake?

Loading: Loading of data refers to moving data into the Snowflake table through staging, where we ingest the data from cloud provider’s storage bucket or from internal stage.

Read to know more about loading.

Unloading: The unloading of data refers to moving data from snowflake to the cloud provider’s storage bucket using Data Staging.

Cloud providers supported by Snowflake

  • Amazon Web Services
  • Microsoft Azure
  • Google Cloud Platform (Preview Mode)

For loading/unloading data to/from snowflake, we need to create integration objects to integrate GCP to Snowflake and an external staging area to temporarily keep and migrate data.

Creating Integration and External Stage:

  • Log into snowflake web console and switch your role as Account Admin
  • Create integration object by giving parameters like type of stage (i.e. External), storage provider (i.e. GCS), blocked locations, allowed locations (here all locations are allowed),etc in snowflake worksheet

replace-storage

  • Below query is used to describe the integration object (here ‘gcs_int’)

             DESC STORAGE INTEGRATION <INTEGRATION NAME>;

  • From the results pane, note down service account name (This is required while adding member to the cloud bucket permissions. Every snowflake account has its unique service account)

desc-storage-integration

  • Log into Google cloud platform, click the ‘Navigation Menu’ -> ‘IAM & admin’ and then ‘roles’

Google-cloud-platform1

  • Create a new role, select permissions and click ‘Create’. Let us give ‘Snowflake_role’ as the role name. Choose all the assigned permissions.

assigned-permission

  • Now go to ‘Navigation pane’ and select ‘Storage’ then click on ‘Browser’ and choose the bucket you want to integrate with the role

storage

  • After choosing the bucket, click show info panel at the top right corner and then ‘add member’

bucket

  • Now in the add member pane, enter the service account name that was received while describing the integration in snowflake console and choose the role you have already created(role can be found in custom role menu, in this case it is ‘Snowflake Role’)

snowflake-role

  • You can also give permissions to other roles to use integration and to create stage to load/unload data using below query
  1. grant usage on integration gcs_int to role Sysadmin;

  2. grant create stage on schema public to role Sysadmin;

  • Now create an external stage

external-storage

  • Use ‘list @stage_name’ to list all the files located in your Google cloud bucket in the results pane

We are all set to load/unload data to/from GCP using Snowflake.

Loading Data into the Table:

  • Create a ‘File Format’ in Snowflake by using WebUI or native SQL command to access the respective file format that is stored in the bucket

Note: In our case, DB Name is ‘Shivam_data’, schema name is ‘Public’ and file format is ‘CsvFile’

simple-table

  • Finally, use ‘Copy Command’ to load the data from GCS bucket into the destination table(here it is ‘SAMPLETABLE’)

destination-table

  • You can verify the data in the destination table using simple select statement

Now, let us unload the data from ‘SampleTable’ into GCS Bucket.

Unloading Data to GCP Bucket:

  • Snowflake auto-compresses the data to be unloaded

compressed-data-unload

  • If compression is not required, choose compression as false and can unload your data into various file formats. Note the change in output_bytes.

data-unload-simplest-file

The unloaded files can now be found in your Google Cloud Bucket:

google-cloud-bucket

 

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Adarsh Srivastava

Adarsh Srivastava is a Senior Business Consultant with over 6 years of content-related experience working in the travel, healthcare, and information technology sectors. Upholding the principles of sincerity, punctuality, and compassion through his writings, he delivers conversational and fact-based content. He is also the author of two books – one of which is a romantic thriller and another an anthology. Feel free to reach out to him for any form of communication.

More from this Author

Follow Us