Amazon Redshift External tables allow you to access files stored in S3 storage as a regular table. You can join the Redshift external table with a database tables such as permanent or temporary table to get required information. You can also perform a complex transformation involving various tables. The External tables are commonly used to build the data lake where you access the raw data which is stored in the form of file and perform join with existing tables.
There are many situations in which you need to access the data without loading it to Redshift. You can simply create external tables using Redshift Spectrum and access a flat file as a regular table within Redshift schema. You can even export a Redshift table by using external table in AWS Redshift
Create External Tables in Amazon Redshift
You can create a new external table in the specified schema. All external tables must be created in an external schema. Once an external table is available, you can query it as if it is regular tables.
The following steps allow you to create external tables in Amazon Redshift:
Now, let us check these steps:
Create an External Schema and Database
You can use create external schema syntax to create external schema within the current database.
Following SQL code creates an external schema spectrum_schema_vs.
create external schema spectrum_schema_vs from data catalog
database 'spectrum_db_vs'
iam_role 'arn:aws:iam::1234567890:role/RedshiftRole'
create external database if not exists;
External schema is visible in the list of the current database schema.
For example,
Create External Table in an External Schema
You can create a new external table in the specified external schema using CREATE EXTERNAL TABLE command.
Following SQL code creates an external table in spectrum_schema_vs external schema.
create external table spectrum_schema_vs.ext_users (
user_id int,
SSN varchar,
first_name varchar,
last_name varchar,
city varchar,
state varchar,
email varchar,
phone varchar,
like_sports boolean,
like_theatre boolean,
like_concerts boolean,
like_jazz boolean,
like_classical boolean,
like_opera boolean,
like_rock boolean,
like_vegas boolean,
like_broadway boolean,
like_musicals boolean)
row format delimited
fields terminated by ','
stored as textfile
location 's3://vithal/data/';
The external table is visible in the list of the tables within external schema.
For example,
Access External Table
Once the external table is created, you can access it as if a Redshift database regular table.
For example, display the top 10 records from the flat file using external table.
List External Tables Available on Amazon Redshift
You can query the svv_external_tables
Redshift system table to display the external tables available on AWS Redshift.
For example,
dev=# select schemaname, tablename, location from svv_external_tables;
schemaname | tablename | location
--------------------+-----------+------------------
spectrum_schema_vs | ext_users | s3://vithal/data
(1 row)
Export Redshift Tables Using External Tables
You can unload the Amazon Redshift tables to S3 folder by using CREATE EXTERNAL TABE AS command.
For example, following command will unload the unique_users table to an external S3 location. This method is different from COPY command or UNLOAD command.
CREATE EXTERNAL TABLE spectrum_schema_vs.unique_users_exp
row format delimited
fields terminated by ','
stored as textfile
location 's3://vithal/data/unload/'
AS SELECT distinct * FROM public.unique_users;
Related Articles,
- How to Export Redshift Data to JSON Format?- Method and Example
- Redshift vs Snowflake – Key Differences
- How to Optimize Query Performance on Redshift?
- Different Methods to Create Redshift Tables – Examples
- Export Redshift Table Data to Local CSV format
- How to Query JSON Data in Redshift? – Examples
- How to Create a Materialized View in Redshift?
Hope this helps 🙂