DBT – Export Snowflake Table to S3 Bucket

dbt, which stands for data build tool, is a data transformation tool that enables data analysts and engineers to transform data in a cloud analytics warehouse. dbt focuses on the Transformation part of ELT (Extract, Load, Transform) processes and supports cloud data warehouses such as Snowflake, Redshift, etc. In this article, we will check how to export your Snowflake table to an S3 bucket from dbt.

What is dbt?

Before going into the details of exporting a Snowflake table to an S3 bucket using DBT, let us first check what dbt is.

DBT (Data Build Tool) is a data transformation tool that focuses mainly on the Transformation part of ELT (Extract, Load, Transform) processes. It performs transformations using SELECT statements, which it turns into tables and views, making the transformation process simple and effective.

DBT – Export Snowflake Table to S3 Bucket

DBT mainly performs transformations using SELECT statements, but Snowflake uses the COPY INTO command to export data from a table. DBT does not directly support the COPY INTO command; as a workaround, you can use DBT's macro capabilities to execute it.

As mentioned, dbt provides an option to write macros that you can use to perform various complex transformations, such as complex calculations or executing multiple SQL statements. You can also run custom macros or additional SQL statements from a pre_hook or post_hook, for example MERGE / UPDATE / DELETE and similar statements.
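
For example, here is a minimal sketch of the post_hook approach. It assumes an external stage named OUTPUT_STAGE (created in Step 1 below) and reuses the TEST_V view from the later example; the hook SQL shown here is illustrative, not the method used in the rest of this post.

{{
  config(
    materialized = 'view',
    post_hook = ["copy into @OUTPUT_STAGE/TEST_V from {{ this }} overwrite = true"]
  )
}}

select * from {{ ref('TEST_V') }}

Here dbt builds the view first and then runs the post_hook, which unloads the view's contents to the stage.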

This guide walks you through how to export a Snowflake table to an S3 bucket using dbt custom macros.

Now, let us check these steps in detail.

Step 1: Create Snowflake External Stage

This external stage will reference the Amazon S3 bucket where you will unload the Snowflake table. External stages are database and schema specific, so choose the appropriate database and schema in which to create the external stage.

You can use the following SQL to create an external stage in Snowflake.

-- Create an external stage that points at your S3 bucket
-- (replace [S3_PATH_HERE] with your bucket and path)
CREATE OR REPLACE STAGE OUTPUT_STAGE
   url = 's3://[S3_PATH_HERE]'
   credentials = (
     aws_key_id = '###'
     aws_secret_key = '###')
   file_format = (type = csv);

Once created, the stage will be available in the specified database and schema. You can verify it using the SHOW STAGES command.
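
For example, assuming the OUTPUT_STAGE created above:

-- Confirm the stage exists in the current database and schema
SHOW STAGES LIKE 'OUTPUT_STAGE';

-- List any files currently under the stage's S3 location
LIST @OUTPUT_STAGE;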

Step 2: Create Macro to Use COPY INTO Command

Macros in Jinja are pieces of code analogous to “functions” in other programming languages, and they are extremely useful if you find yourself repeating code across multiple models. Macros are defined in .sql files, typically in the macros folder of your project.

You can use the following macro to execute Snowflake COPY INTO command from dbt.

{% macro unload_to_s3(t) %}

    {{ log("Unloading data", True) }}
    {% call statement('unload_test', fetch_result=true, auto_begin=true) %}
        copy into @OUTPUT_STAGE/{{ t }}
        from {{ t }}
        overwrite = true
    {% endcall %}
    {{ log("Unloaded data", True) }}

{% endmacro %}

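If you need to unload several tables, a slightly more general macro may help. The sketch below is illustrative only (the macro name, the stage_name argument, and the CSV options are assumptions, not part of the original example); it parameterizes the stage and adds a header row to the unloaded files:

{% macro unload_table_to_stage(table_name, stage_name='OUTPUT_STAGE') %}

    {{ log("Unloading " ~ table_name ~ " to @" ~ stage_name, True) }}
    {% call statement('unload_' ~ table_name, fetch_result=true, auto_begin=true) %}
        copy into @{{ stage_name }}/{{ table_name }}
        from {{ table_name }}
        file_format = (type = csv compression = none)
        overwrite = true
        header = true
    {% endcall %}
    {{ log("Unloaded " ~ table_name, True) }}

{% endmacro %}

As an alternative to calling a macro from a model, you can also invoke it directly from the command line, for example dbt run-operation unload_to_s3 --args '{t: TEST_V}'.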

Step 3: Export Snowflake Table Using DBT

A model is a SELECT statement. Models are defined in .sql files, which are typically stored in your models directory. Let us use the macro in our model file to execute the unload command.

In the following example, we call the custom macro from the model file. We also keep a dummy SELECT statement; otherwise, the run ends with the error “Tried to run an empty query on model“.

{{
  config(
    materialized='run_custom_sql'
  )
}}
select * from {{ref('TEST_V')}};
{{ unload_to_s3('TEST_V') }}

The above SQL file creates a dummy view and executes the COPY INTO command to export the Snowflake table.
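
For reference, with 'TEST_V' as the argument, the unload_to_s3 macro from Step 2 renders to roughly the following Snowflake statement, which dbt then executes against your warehouse:

copy into @OUTPUT_STAGE/TEST_V
from TEST_V
overwrite = true

Run the model as usual (for example, with dbt run); once it completes, the unloaded CSV files should appear under the stage's S3 path, and you can verify them with LIST @OUTPUT_STAGE.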

Hope this helps 🙂