How to Change Redshift Table Distribution Style and Example

  • Post author:
  • Post last modified:September 18, 2019
  • Post category:Redshift
  • Reading time:6 mins read

Choosing right distribution style is one of the important factors to improve the performance of Redshift Database. Distribution style will directly affect the performance of your query. Table with wrong distribution style might hamper your Redshift cluster because of workload. In this article, we will check how to change Redshift table distribution style with an example.

Why to Change Redshift Table Distribution style?

There are three types of distribution style available in Redshift:

  • EVEN distribution
  • KEY distribution
  • ALL distribution

You can choose any of the style based on your data, size and performance considerations.

Most common distribution style is EVEN, it is used when you are not sure about which style to choose. If you have created the table with EVEN distribution or with different column with lots of duplicate records, then you should immediately change the distribution style otherwise that will reduces the performance. Changing Redshift table distribution style is a process of redistributing the Redshift.

Change Redshift Table Distribution style Example

As mentioned earlier, you cannot change the Redshift table distribution using alter table column. You have to redistribute the table data using CREATE TABLE AS command with new distribution style.

For example, consider below CTAS example to redistribute the table data in Redshift.

create table event_new_dist
distkey (eventid)
sortkey (eventid, dateid)
as
select eventid, venueid, dateid, eventname
from event;

Note that, this works well with a table which has a relatively small amount of data. If your table is very large, you can follow below section.

Related Articles

Change Distribution of Large Table in Redshift

If your table is very large, then it will be very difficult to redistribute table because of space availability.

Follow below steps to redistribute very large table:

  • Break up Your Table Data

In this step, first, create the smaller tables with filtered data. You can use date range for the same.

For example, consider below example to create a temp table and insert with a few records.

create table event_temp
distkey (eventid)
as
select eventid, venueid, dateid, eventname
from event
limit 0;
insert into event_temp
select eventid, venueid, dateid, eventname
from event
where event_time between '2019-01-01' 
and '2019-06-30';

Now, use same date range to delete from base table as the same data is transferred to a temp table with new distribution style.

delete from event
where event_time between '2019-01-01' 
and '2019-06-30';

Repeat the process until events table is empty.

  • Drop Very Large Table

We have an entire data from base table into temp table. You can drop the base table.

For example, drop events table.

drop table events;
  • Rename Temp Table to Original Table

Now, rename the temp table to original table name.

For example, consider below example.

alter table event_temp rename to events;

Now, you have events table with different distribution style.

Related Articles

Hope this helps 🙂