Amazon Redshift Distribution Types and Examples

Amazon Redshift is a cluster of nodes, each with its own disks, memory and CPU. Redshift distributes table rows across the nodes according to the distribution style you choose for a table or materialized view. In this article, we will look at the Amazon Redshift distribution types along with some examples.

Amazon Redshift Distribution Types

There are four distribution types available in Amazon Redshift: EVEN, KEY, ALL and AUTO. You can choose any of them based on your requirements and the type of joins you are going to perform on the tables.

Redshift EVEN distribution

In this type of Redshift distribution, the leader node distributes the rows across all data slices in a round-robin fashion. EVEN distribution is appropriate when the table does not take part in joins. You can also choose EVEN distribution when you are not sure whether KEY or ALL distribution is the better fit.

Redshift EVEN distribution Example

Below is an example that creates a table with EVEN distribution:

create table sample 
(
id int,
name varchar(100),
age int
)
DISTSTYLE EVEN;
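
Starting with EVEN does not lock in the choice. As a rough sketch, assuming the sample table above already exists and later turns out to be joined on its id column, the distribution style can be changed in place with ALTER TABLE:

```sql
-- Assumes the sample table created above exists.
-- Switch from EVEN to KEY distribution on the id column.
alter table sample alter diststyle key distkey id;

-- Switch back to EVEN if the join pattern changes again.
alter table sample alter diststyle even;
```

Changing the style makes Redshift redistribute the existing rows, so on a large table this can take a while.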

Redshift KEY distribution

In Redshift KEY distribution, rows are distributed according to the values in one column. The leader node places rows with matching values in that column on the same data slice. If two tables are distributed on the same column and you join them on that distribution column, the matching rows are already on the same data slice, which makes the tables collocated. Collocated tables improve join performance because rows do not have to be redistributed between nodes at query time.

Redshift KEY distribution Examples

Below is an example that creates a table with KEY distribution:

create table sample 
(
id int,
name varchar(100),
age int
)
DISTSTYLE KEY
DISTKEY(ID)
;
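
The collocated join described above can be sketched with two tables that share a distribution column. The customers and orders tables and their columns are hypothetical names used only for illustration:

```sql
-- Hypothetical tables; both are distributed on customer_id.
create table customers
(
customer_id int,
name varchar(100)
)
DISTSTYLE KEY
DISTKEY(customer_id)
;

create table orders
(
order_id int,
customer_id int,
amount decimal(10,2)
)
DISTSTYLE KEY
DISTKEY(customer_id)
;

-- Joining on the distribution column lets each slice join its own rows locally.
explain
select c.name, sum(o.amount) as total_amount
from customers c
join orders o on o.customer_id = c.customer_id
group by c.name;
```

In the EXPLAIN output, a join step labelled DS_DIST_NONE indicates that no redistribution was needed. Labels such as DS_BCAST_INNER or DS_DIST_BOTH suggest the tables are not collocated for that join. The exact plan also depends on table statistics, so treat this as a starting point.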

Redshift ALL distribution

If you specify the ALL distribution style during table creation, the leader node distributes a copy of the entire table to every node in the cluster. If the table is small and you want to make it collocated with the tables it joins to, this distribution style can be a good choice.

Loading the table takes longer when it uses the ALL style, because every node has to store a full copy of the data.

Redshift ALL distribution Example

Below is an example that creates a table with ALL distribution:

create table sample 
(
id int,
name varchar(100),
age int
)
DISTSTYLE ALL
;
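
As a sketch of how an ALL table is typically used, the hypothetical country lookup table below is copied to every node, so it can be joined to a larger table that is distributed on a different column. The sales table and all column names are assumptions made for illustration:

```sql
-- Hypothetical small lookup table, copied in full to every node.
create table country
(
country_code char(2),
country_name varchar(100)
)
DISTSTYLE ALL
;

-- Hypothetical larger table, distributed on a column unrelated to the lookup.
create table sales
(
sale_id int,
customer_id int,
country_code char(2),
amount decimal(10,2)
)
DISTSTYLE KEY
DISTKEY(customer_id)
;

-- The lookup rows are available locally on each node for this join.
select c.country_name, sum(s.amount) as total_amount
from sales s
join country c on c.country_code = s.country_code
group by c.country_name;
```

Because every node holds the full country table, the join does not depend on how sales is distributed. Whether this beats letting Redshift redistribute a small table at query time depends on the workload, so it is worth comparing the query plans for both options.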

Automatically Pick the Best Distribution Style

Amazon Redshift can automatically assign an optimal distribution style based on the size of the table data; this is the AUTO distribution style. With the automatic selection of the right distribution style, you get better query performance and storage space utilization across nodes.
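
A minimal sketch of letting Redshift choose the style: the table below is created with DISTSTYLE AUTO, and the SVV_TABLE_INFO system view is then queried to see which style is currently in effect. The table name sample_auto is a placeholder chosen for this example:

```sql
-- Placeholder table name; Redshift picks the distribution style.
create table sample_auto
(
id int,
name varchar(100),
age int
)
DISTSTYLE AUTO
;

-- Inspect the style currently in effect, plus row count and row skew.
select "table", diststyle, tbl_rows, skew_rows
from svv_table_info
where "table" = 'sample_auto';
```

SVV_TABLE_INFO does not list empty tables, so the query returns a row only after data has been loaded. For AUTO tables the diststyle column shows the style currently chosen, for example AUTO(ALL) or AUTO(EVEN).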

This Post Has 2 Comments

Aurelien Duarte November 29, 2017

Hi Vithal,

Your comment on small tables for the All distribution should be removed as it is misleading.

Refer to: http://docs.aws.amazon.com/redshift/latest/dg/c_choosing_dist_sort.html “Small dimension tables do not benefit significantly from ALL distribution, because the cost of redistribution is low.”
1. Vithal Sampagar November 30, 2017
  
  Hi,
  
  Thank you for the information.
  
  In an MPP environment, collocation of tables improves join performance, and each node works independently of the others. If your table is very small and you join it in queries, then ALL distribution is optimal.
  
  Thank you