How to Create an Index in Amazon Redshift Table?

  • Post last modified:April 14, 2021

Indexing is a way to optimize database queries. An index is a data structure that lets the database quickly locate and access rows, which reduces disk I/O when a query searches for a particular value. Many traditional relational databases, such as Oracle and SQL Server, support indexes. Amazon Redshift, being a columnar database, does not support indexes on tables. In this article, we will look at the alternative approach to indexing in an Amazon Redshift table.

How to Create an Index in Amazon Redshift Table?

Amazon Redshift is a columnar database designed specifically for data warehousing. Redshift does not support regular indexes; instead, you use data distribution and sort keys to reduce disk I/O.
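As a rough sketch of how a distribution key and a sort key can be declared together, consider the table definition below. The table orders_demo and its columns are hypothetical and exist only for illustration; the right choice of keys depends on your own query patterns.

```sql
-- Hypothetical table, used only to illustrate the syntax.
CREATE TABLE orders_demo (
    order_id    BIGINT        NOT NULL,
    customer_id INTEGER       NOT NULL,
    order_date  DATE          NOT NULL,
    amount      DECIMAL(12,2)
)
DISTSTYLE KEY
DISTKEY (customer_id)                        -- rows with the same customer_id land on the same slice
COMPOUND SORTKEY (order_date, customer_id);  -- rows are ordered by date, then by customer
```

Here the distribution key is assumed to be a column that is frequently joined on, and the sort key a column that is frequently filtered by range.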

Amazon Redshift supports unique key, primary key, and foreign key constraints, but they are informational only: Redshift does not enforce them. However, the query optimizer uses these constraints to build an optimal query execution plan.
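To see what "informational only" means in practice, here is a minimal sketch. The tables country_demo and customer_demo and their data are made up for this example.

```sql
-- Hypothetical tables, used only for illustration.
CREATE TABLE country_demo (
    country_id   INTEGER      NOT NULL PRIMARY KEY,
    country_name VARCHAR(100) NOT NULL
);

CREATE TABLE customer_demo (
    customer_id INTEGER      NOT NULL,
    name        VARCHAR(100) NOT NULL,
    country_id  INTEGER      NOT NULL REFERENCES country_demo (country_id),
    PRIMARY KEY (customer_id)
);

INSERT INTO country_demo  VALUES (1, 'India');
INSERT INTO customer_demo VALUES (1, 'Alice', 1);

-- The second row reuses customer_id = 1. Redshift does not reject it,
-- because the primary key is declared but not enforced.
INSERT INTO customer_demo VALUES (1, 'Bob', 1);

-- Check for duplicates yourself; the database will not do it for you.
SELECT customer_id, COUNT(*) AS row_count
FROM customer_demo
GROUP BY customer_id
HAVING COUNT(*) > 1;
```

Since the optimizer trusts the declared constraints, it is safest to declare them only when your load process guarantees that the data really is unique.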

Read more about different constraints and how to create them in my other articles.

Redshift SortKey as an Index Alternative

A sort key is a column (or set of columns) in your Redshift table that determines the order in which the data is physically stored on disk. Amazon Redshift stores your data in sorted order according to the sort key that you specify when creating the table.

The following Redshift CREATE TABLE statement defines a sort key.

CREATE TABLE customertest ( 
customer_id INTEGER NOT NULL, 
name VARCHAR(100) NOT NULL,
country_id INTEGER NOT NULL 
) SORTKEY(customer_id);

Redshift stores the data on disk in sorted order according to the sort key, i.e., customer_id.
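As an illustration of how the sort key stands in for an index, the sketch below runs a range filter on customer_id against the customertest table defined above, then inspects the table definition. The filter values are arbitrary.

```sql
-- Range filter on the sort key column. Because the data is sorted by
-- customer_id, Redshift can skip blocks whose values fall outside the range.
SELECT customer_id, name
FROM customertest
WHERE customer_id BETWEEN 1000 AND 2000;

-- Check which columns belong to the sort key.
-- A non-zero sortkey value means the column is part of the key.
-- Note: pg_table_def only lists tables in schemas on your search_path.
SELECT "column", type, sortkey
FROM pg_table_def
WHERE tablename = 'customertest';
```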

You can read more about selecting sortkey in my other article.

How to Add Sort Key to Existing Table?

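In Redshift, you cannot use ALTER TABLE ... ADD COLUMN to add a column that is the distribution key (DISTKEY) or a sort key (SORTKEY) of the table.

One option is to create a new table with the required sort key and insert the data from the old table into the new table. Newer Redshift releases also let you change the sort key of an existing table in place with ALTER TABLE ... ALTER SORTKEY.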
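The copy-to-a-new-table approach can be sketched as follows. The names customertest_new and customertest_old are hypothetical, and country_id is chosen as the new sort key purely as an example.

```sql
-- 1. Create a new table with the desired sort key.
CREATE TABLE customertest_new (
    customer_id INTEGER      NOT NULL,
    name        VARCHAR(100) NOT NULL,
    country_id  INTEGER      NOT NULL
) SORTKEY (country_id);

-- 2. Copy the data from the old table into the new one.
INSERT INTO customertest_new (customer_id, name, country_id)
SELECT customer_id, name, country_id
FROM customertest;

-- 3. Swap the tables, keeping the old one until the copy is verified.
ALTER TABLE customertest     RENAME TO customertest_old;
ALTER TABLE customertest_new RENAME TO customertest;

-- 4. Drop the old table once you are satisfied with the result.
DROP TABLE customertest_old;
```

Renaming before dropping keeps the original data available if the copy needs to be checked or repeated.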


Hope this helps 🙂