The Netezza Organize on is one of important feature introduced in the Netezza data warehouse appliance. With the help of Organize on keys, you can create the clustered base tables (CBT). You can specify up to four columns in organize on key.
Read:
- Cluster Based Tables – CBT
- Netezza TwinFin Architecture
- IBM Netezza nzsql Command and its Usage
- Netezza ALTER TABLE Command
Netezza Organize On Keys
IBM Netezza Organizing keys are useful for the more frequently used columns that you specify as keys are used in query predicates, alone or in combination, and if the column cardinality is high that is, if the columns have many different values.
Netezza Organize On Syntax
You can specify the Netezza Organizing Keys when creating the table or using ALTER table command.
Netezza Organize On while Create Table
CREATE TABLE <tablename> (col1 int, col2 int, col3 int) DISTRIBUTE ON RANDOM ORGANIZE ON (<col>) ;
Netezza Organize On Example
create table test1 (col1 int, col2 varchar(20)) distribute on random ORGANIZE ON (col1,col2) ;
Netezza Organizing Keys using ALTER TABLE
ALTER TABLE <table> <action> [ORGANIZE ON {(<columns>) | NONE}];
Netezza Organizing Keys using ALTER TABLE Example
ALTER TABLE test1 ORGANIZE ON (col1, col2);
Netezza Organizing Keys Performance Impact
For Netezza organizing keys to have a positive impact on the table scan performance, a Netezza nzsql query does not have to refer all the organizing key columns. If you use just one of these column in a query predicate (WHERE clause) is enough.
Read:
- Netezza Best Practices to Improve Performance
- Analyze and Optimize Netezza Query Performance
- Improve Performance with help of Netezza Query Plan
However, if you use the organizing key column in the WHERE clause of the Netezza nzsql queries then the performance will be higher. If there is a sorted data, response time improves significantly compared to the unsorted case if you have many queries running in parallel on a table.
Rule of thumb for choosing Netezza organizing key
If many different queries in your workload use WHERE clause predicates (range or equality predicates) on the same column, then use this column for your organizing key.