Redshift Table Data Skew and How to Avoid It
You will hear a lot about "data skew" if you are developing a data warehouse on Redshift, Netezza, Teradata, Hive, or Impala. In an MPP database, the performance of the system is directly linked to the uniform distribution of user data across all data nodes in the system. When you create a table and then load data into it, the rows of the table should be distributed uniformly among all the data nodes. If some data node slices have more rows of a table than others, this scenario is…
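To make the idea of uneven row distribution concrete, here is a minimal Python sketch that assigns rows to slices by hashing a distribution key and then measures how lopsided the result is. It is only an illustration: `rows_per_slice` and `skew_ratio` are hypothetical helper names, `zlib.crc32` is a stand-in for whatever hash function a real MPP engine uses internally, and the slice count of 4 is arbitrary.

```python
import zlib


def rows_per_slice(keys, num_slices):
    """Count how many rows land on each slice under a simple hash distribution."""
    counts = [0] * num_slices
    for key in keys:
        # crc32 is an assumption for illustration, not Redshift's actual hash.
        counts[zlib.crc32(str(key).encode()) % num_slices] += 1
    return counts


def skew_ratio(counts):
    """Rows on the fullest slice divided by rows on the emptiest slice.

    A value of 1.0 means perfectly even; larger values mean more skew.
    """
    smallest = min(counts)
    return float("inf") if smallest == 0 else max(counts) / smallest


if __name__ == "__main__":
    # A high-cardinality key (unique ids) versus a key dominated by one value.
    unique_ids = range(100_000)
    mostly_one_value = ["US"] * 90_000 + [f"C{i}" for i in range(10_000)]

    for label, keys in [("unique ids", unique_ids), ("mostly 'US'", mostly_one_value)]:
        counts = rows_per_slice(keys, num_slices=4)
        print(label, counts, skew_ratio(counts))
```

In the second data set every "US" row hashes to the same slice, so that slice holds at least 90% of the table while the others share the rest. This is the kind of imbalance the term "data skew" refers to.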