Redshift Table Data Skew and How to Avoid It

You will hear a lot about "data skew" if you are developing a data warehouse on Redshift, Netezza, Teradata, Hive, or Impala. In an MPP database, system performance is directly linked to the uniform distribution of user data across all data nodes in the system. When you create a table and load data into it, the rows of the table should be distributed uniformly among all the data nodes. If some data node slices have more rows of a table than others, this scenario is…


Netezza Skew and How to Avoid It

You will hear a lot about "Netezza skew" if you are developing a data warehouse on Netezza, Redshift, Teradata, Hive, or Impala. System performance is directly linked to the uniform distribution of user data across all of the data slices in the system. When you create a table and load data into it, the rows of the table should be distributed uniformly among all the data slices. If some data slices have more rows of a table than others, this scenario is called skew.…
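
To make the definition concrete, here is a minimal Python sketch that measures how uneven a table's rows are across data slices. The function name `skew_ratio` and the per-slice row counts are hypothetical, chosen only for illustration, and the max-to-mean ratio is just one simple way to quantify skew, not necessarily the metric your database reports.

```python
def skew_ratio(rows_per_slice):
    """Return the largest slice's row count divided by the mean row count.

    A value of 1.0 means the rows are spread perfectly evenly;
    larger values mean one slice holds more than its fair share.
    """
    if not rows_per_slice:
        raise ValueError("rows_per_slice must contain at least one slice")
    total = sum(rows_per_slice)
    if total == 0:
        # An empty table has no skew to speak of.
        return 1.0
    mean = total / len(rows_per_slice)
    return max(rows_per_slice) / mean


# Hypothetical row counts for a 1,000-row table on four data slices.
uniform = [250, 250, 250, 250]
skewed = [700, 100, 100, 100]

print(skew_ratio(uniform))
print(skew_ratio(skewed))
```

In the skewed case one slice holds most of the rows, so a query that scans the table has to wait for that slice to finish while the others sit idle.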
